Wednesday, 5 July 2023

Sustainable Coding, and how do I apply it to myself as a Cloud engineer?

 I work as a developer of a Cloud service, Big Animal - EDB's Cloud Postgres product. So I went along to a meetup the other day, a panel discussion on Leveraging Cloud Computing for Increased Sustainability

It got me thinking about this whole issue, and how in a practical sense I could do anything that might reduce the carbon footprint of the service I work on. 

The conclusion I came to was that I don't really know ... and to some extent neither did the panel, so Cloud computing may give you some fancy tools to help assess these things, such as Microsoft Sustainability manager. But there are no black and white answers as to what would make something more sustainable - even the basic one of - run it in the cloud or on prem,  very much depends on what you are running and how. For one or other to work out as the more sustainable.

So on a global scale just how significant is computing as a percentage of global energy consumption and emissions?

The Cloud Climate Issue

Comparing today with 30 years ago is useful in terms of seeing where we are going...


1990s vs 2020s IT as a proportion of global energy and emissions

  • 1990s Energy 5% (Most from Office desktop computers and CRTs)   - 2% emissions
  • Today 8% (Most personal devices, laptops and mobile, includes 2% data centres) - 3% emissions

  • Compute power / storage  is around 30,000 times greater (by Moore's Law)
  • Data 16 Exabytes (EB) has grown to 10,000 EB so 600 times, with the majority in the last 3 years

Today data centres (hence the Cloud) is causing 2% emissions, as much as the whole of IT in 1990 and as much as today's aviation industry.

So working as a cloud engineer looks like a poor choice for someone concerned about climate change!

But on the face of it we have been pretty efficient, our compute and storage has massively increased, yet consumption + emissions only by 50%. But the issue is the acceleration in usage, which means we could double energy and emissions in 20 years, if nothing was done to improve sustainability.

The increase in compute power has remained fairly consistent since the advent of the transistor making Moore's Law, more a law of Physics than of human behaviour. Although of course that technology is now at its limits of miniaturisation. So the energy and emissions consumed per Gigaflop of compute has drastically dropped - but now everyone has the compute power of a supercomputer in their pockets.
The first supercomputer to reach 1 GFlop was the Cray in the 80s, by the 90s an IBM 300 GFlops supercomputer beat Gary Kasparov at chess - today a Google Pixel 7 phone is 1200 GFlops.
Hence our consumption has rather outstripped our increase in compute.

But it is the explosion in data that is a story of human behaviour.  Hand in hand we have reduced costs of cloud storage and monetised personal data. With software companies valued based on how many customers, and more importantly how much customer data they have. Recent advances in AI have proved the value of big data lakes for training models to produce practical ML applications. 

Combine that with the problem of induced demand. The more and bigger roads you build, the more traffic you get. Cloud puts a 6 lane highway outside everybody's front door.

How do we measure sustainability

So within the world of commercial sustainability, and carbon off setting, there is a basic concept  to categorize things as scope 1-3 emissions.

  1. Scope 1 covers emissions from sources that a company owns or controls directly.
  2. Scope 2 are emissions that a company causes indirectly and come from where the energy for the services it purchases and uses is produced.
  3. Scope 3 encompasses everything else. So suppliers energy use, etc.
The assumption is that raw energy consumption is not the issue. It is the generation of climate changing emissions to generate that energy that is the metric.  

This includes mining for minerals to build laptops and data centres, etc. But if you run your own green energy solar farm next to your data centre, and that is direct powering, without any significant battery storage. Plus feeding energy back to the grid, you can be pretty much carbon neutral. You can also fund renewable energy projects and offset. 

So strangely perhaps, given how full cooling can treble hardware life span. The biggest data centres in the world are currently built in the world's desserts rather than at the north pole. Solar and wind can be relied upon for more than 100% power.
  • Microsoft Azure was carbon neutral in 2012. It is aiming for 2030 for its whole business (then for 2050 to removing all its carbon debt since it was founded in 1975)
  • Google Cloud became carbon neutral for all its data centres in 2017, also aiming for 2030 for all its business.
  • Amazon is aiming for AWS cloud neutral by 2025, and as a global retail supplier to do the same for its whole business by 2040.
Of course this is not possible in most European countries, so most carbon neutral data centres in Europe will be from purchasing carbon neutral generated energy, rather than actually being neutral in themselves. Although some go a long way down that road, to partner with renewable energy suppliers and tick a number of other sustainability boxes. The problem is if data centres are buying up lots of the renewable energy supply at a premium, then that means they are removing it from residential or other uses. So this is hardly helping global sustainability and in reality means they are far from neutral.

Also carbon neutral means only that scope 1 is covered. Net zero is a standard above carbon neutral where to deal with scope 2 and 3, emissions must be taken out of the atmosphere. So that in practise only a net zero supplier is actually contributing nothing to climate change.  No cloud provider is net zero.

A key point is that the latest enormous scale cloud provider data centres are not the main source of emissions, it is all the older, smaller, more local data centres and machine rooms of servers that are causing the majority of the emissions. In the same way that car pollution is disproportionately down to older vehicles. Of course there is the manufacturing footprint to consider for cars that can last 40 years, but all computer hardware has a much shorter lifespan of 3-5 years. Obsolescence makes increasing the lifespan uneconomic. Another green issue that could fill a blog post on its own.

So moving to cloud provider's services and migrating any remaining on prem to the cloud is the sustainable thing to do, as long as what is moved is suited to cloud, or can be re-architected for the cloud.

What changes, as a developer, could improve sustainability?

Life Style

So the obvious thing that people think of, is the nature of their employer's work. Or perhaps if your company is a B2B one, do they have green standards wrt. the clients that they work with. For example it may not make sense working for ExxonMobil as the company with the world's largest emissions. Perhaps the tech industry equivalent would be working on Crypto currency? But Blockchain developers are working on that reputation, even coming up with useful uses for it!, such as auditing sustainability usage for scope 2 and 3 verification. 

Over half of internet traffic these days is video streaming, so stop watching Netflix and scrolling on TikTok - and read or listen to books instead is maybe a good behaviour 😉
On the plus side porn has dropped from its high as 25% of internet traffic down to around 10%, but it has been more than replaced by cat and side hustle millionaire videos it seems. So if your side hustle is being a prolific social YouTuber, it may not be the most ecological of life choices. Since an hour long short story of digital text is 100 Kb whilst the same as a 4k video is a million times bigger at 10 Gb.

On a personal level, my previous employer was more office orientated. So it was keen to encourage people into the office with free food etc. it encouraged commuting in to work, and the maintenance of offices with permanent desk space for every employee, monitors, heating etc. and all the unnecessary extra emissions that entails. My current one is more remote-first.

In terms of remote work, having experienced pandemic lock downs in a city. I was going out for a regular cycle for exercise, so I can confirm that the reduction in emissions may have only been measured at 20% across the whole of Britain, but in the cities it felt more like 50% - the air was so much more breathable.Whilst maximising WFH is not equivalent to pandemic lock downs, it does make a difference. So changing jobs in the tech sector, to a full-time remote position, is certainly a worthwhile contribution to sustainability.

There is the argument that if we all lived alone in big drafty castles which could be turned off for the day by packing into an office a walk away. Then remote working is not more sustainable. But the reality of IT work today, especially with hybrid working, is that the big fairly empty building you are more likely to be in these days, is the office.

So become full time remote if you can. If you have to work for an office based employer then choosing one that has hot desking, smaller offices, less frequent attendance and live within walking or cycling distance, are all part of being sustainable wrt. your tech job.

Sustainability for a Cloud SaaS company

I work for a company that produces a cloud market place software product, with most engineers working remotely and running no servers at all, just employees laptops, ie everything we run is via cloud providers services. We have a few offices globally but only a minority of engineers use them. Since all teams are largely remote, there is no office, no paperwork, commute or physical products.

So the same applies to all our other services, eg. from CI to presentations. From LaunchDarkly to our CRM. From expenses to online mental health support etc. Plus Slack and Zoom for comms.

This is a pretty common model, your could call it a server-less company, it was the same at my previous employer. We sell SaaS and we use it for everything internally too

Therefore the assumption is that the problem of working out 2 and 3 should be solved by those cloud providers, which to some extent it is ... maybe some less than others. But emissions data can be obtained for scope 2 and 3 from them.

So that leaves scope 1. This may be hugely affected by how much face to face sales and marketing goes on etc, but that is not my area. So I am purely going to focus on what options are there to improve sustainability wrt. to the software architecture, development and deployment practises available for producing a cloud based software service, SaaS. Since those are the areas that as a software engineer I can influence.

So lets break that down to some basic elements, and work out what are the more sustainable practises and approaches.

Cloud vs. On Prem

So first things first. Is working for a company that runs everything on cloud, and delivers a cloud based product a good thing, versus writing software for running in a local server room or data centre?

Assuming you use one of the big carbon neutral cloud providers and are using virtualisation to scale capacity efficiently with usage. Then it is likely that a Cloud data centre will be run much more sustainably than a local data centre where you may house your own servers and certainly a local machine room. So even if you are running a specialised HPC data-centre where the majority of traffic is local ... third party providers will be able to offer more sustainable options.

Of course if you software is entirely unsuited to cloud virtualisation (k8s micro-services etc) or badly designed for it, you could actually be running up way more resources than a local monolithic solution on a few dedicated servers would. So sustainability goes all the way down through the architecture to the lines of code, and what they are written in.

A whole load of legacy software dumped onto the cloud can be less sustainable (and way more costly) to run than running it locally. 

So another sustainable employment decision, is to not work for an organisation that either has a lot of legacy software or has its  own servers or data-centres, or at least only if they are bigger than a soccer pitch (ie average DC size or bigger) and have their own adjacent wind farm or other local renewable power source.

But if like my employer, everything is run on the three major cloud providers, and there is very little in the way of scope 2 and 3, then is the sustainable business box ticked already?

Unfortunately not, as mentioned, they are not net zero and ~2% of global emissions are from running data centres, so whilst that may be disproportionately from ones that are not the self powered giant DCs used by the big cloud vendors. Being as efficient as possible wrt. use of Cloud is still the key to being a sustainable tech worker. Especially with the projected growth in Cloud and its emissions being a significant ecological concern.

Choice of software languages


So the reference paper often quoted (and misinterpreted) for software language sustainability is this Portuguese University paper on Energy Efficiency across Programming Languages.

Where we could perhaps regard sustainability as the combined goal of minimising energy = performance, time and memory usage (table 5. in the paper). So leaving out the older / less mainstream languages we have ...
  1. C, Go
  2. Rust, C++
  3. Java, Lisp
  4. C#
  5. Dart, F#, PHP
  6. Javascript, Ruby, Python
  7. TypeScript, Erlang
  8. Lua, JRuby, Perl
So on that basis we should write everything in C or Go or possibly Rust, maybe even Java if we are not that eco-friendly.

Whilst I do use Go for writing Cloud micro-services, I think the paper's focus on executing a few specific algorithmic performance tests is maybe not an entirely representative approach.

I have been a Python developer for 20 years and Python is ranked almost last for speed. 75 times slower than C at the top spot. But even if this were the case across all uses, then it ignores the fact that for compute heavy tasks where Python is employed in number crunching, it uses high performance libraries for the core processing functions. So Numpy is half C and runs all the big matrix manipulations in C.

Hence the API coding and setup is in Python but it is not actually running everything 78 times slower than C, it is running maybe at worst, half the speed of a pure C program. Plus that custom pure C program could well have taken a lot longer to write and be less reusable, so in total use way more energy than a Python version would. Especially for short lived code and Jupyter interactive coding orientated use cases such as used in the science and finance sector. 

There are further optimisation approaches such as Numba, when Python is being used for fast computational use cases which can compile straight to CUDA machine code for GPUs. 
A paper comparing Java, Go & Python for IoT decision making. Similarly puts Go at the top for efficiency, but places Python above Java (presumably Python was using SciKit hence C for performance critical algorithm execution). So clearly the use case and the methodology of the study, can make a huge difference in the measured efficiency.

The same could probably be said for a number of the other languages languishing at the bottom of the table. If measured for executing a real world use case rather than a pure language implementation, the results can be much improved.
However for very nimble light weight micro-services then a directly compiled language like Go is going to use less resources than languages using JIT VMs and/or an interpreter. 

Then there is the core point that most applications in the cloud are not highly intensive calculation based ones. The performance of the majority of applications are more likely to be due to the data I/O on the network between services and storage. Where raw algorithmic performance has little impact.

What does matter is that running up parallel processes is simple and lightweight.
That core feature, along with the simplicity of Go and its small footprint were designed specifically for cloud computing. Which means, becoming a Go programmer, or at least learning it. Is a good choice for the more sustainable programmer.

It is also why ML/Ops will often use Python at the development and testing stages of ML models, but then switch to Go implementations for production.


Software Architecture


The architecture that is deployed to cloud has a huge impact on the efficiency of a cloud service, and hence its sustainability. Certainly it is going to have much more impact on energy wastage than the raw algorithmic performance of the language used.

The architectural nirvana of cloud services are that they are composed of many micro-services, each managing a discrete component of the service's functionality and each able to scale independently to provide a demand driven, auto-scaled service that ramps up and down whatever components are required from it at any given time. Morphing itself to provide always sufficient capacity. Not needing stacks of wasteful hot failover servers running without a job to do. Not getting overloaded at peak and failing to deliver on uptime.
The ideal sustainable use of hardware, always just enough. Virtualisation allowing millions of services to ramp up and down across the shared Cloud provider DCs vast hardware farms.

Clearly, combined with Big Cloud using the latest carbon neutral DCs, this ideal is much more sustainable than each company running its own servers and machine rooms 24/7 on standard grid non-renewable power, for a geo-local service that only approaches full capacity twice a day, and could probably be happily turned off 6 hours a night with nobody noticing.

From this perspective, one the big cloud vendors are keen to promote, Cloud is the sustainable solution not the problem.

Unfortunately that ideal is often very far from the reality.

Software that is essentially monolithic in design can end up being lifted and shifted to the cloud with little refactoring. At best the application is maybe chopped up into a few main macro-services. UI, a couple of backend services and data store as another. Then some work done to allow each to be deployed to Kubernetes as pods with 3 or more instances in production. Ideally the replicas are identical in role and have good load balancing implemented, or multi-master replication for the storage. But often primary-replica is as good as it gets.

Essentially an old redundant physical server installation with a few big boxes to run things is being re-implemented via k8s. Then repeat that per customer, usage domain, geo-zone or whatever sharding is preferred. Big customers get big instance's - the providers have wide sizing ranges for compute, storage etc.

Its better than just setting up a VM to replace each of your customer's on prem boxes - and basically changing almost nothing from on prem installs, but any increased sustainability is only that provided by the Cloud vendor's DCs. The solution is not cloud scale with auto-scaling, its repeated enterprise scale with a lot of fixed capacity in there.

For these cases maybe consider swapping out some elements with a cloud provider scaled service, eg the storage. Whether that is by using the Cloud provider's solution or a third party vendor's market place one.

Even for software that has been freshly written for the cloud there can be architectures that consume excessive resources and are overly complex, some times because of the opposite issue. So with the budget to rewrite for cloud, then developers can leap too fast for all cloud scale solutions - when the service has no need of them. For example deploying multi-region Kafka for event streaming and storage, when data could happily have been sharded regionally and put into a small Postgres or MariaDB cluster. 

Repeatedly firing up a 'micro-service' k8s job that is very short lived but uses a big fat monolith code base, so that 80% of the time and cost of the job is in the startup. This is where language matters more, the lighter and faster the language, the smaller the binary and its memory usage, the better.

The use of gRPC between micro-services provides 10 times the speed of REST, which can be reserved just for the whole service API to the UI and CLI.

One key indicator of waste is the obvious one of cost. If your new cloud deployed application is generating usage costs that work out far more expensive than the TCO for its on prem deployment. Then its architecture is not fit for cloud use. You are burning money and generating excess C02.

Sadly with architecture it all depends what suits the scale and use cases of a service. So there is no simple fix it advice here.

Development, Testing & Release practises


Testing and release are probably the most important area of Cloud software development that could benefit from more sustainable practises. This is perhaps more a pitfall of the rise of Docker and infrastructure as code, rather than Cloud itself, but the promise of replicable automated built software environments has delivered. 

What it has delivered is a development to production life cycle where developers can spin up any number of their own development environments - even one per Jira ticket, automatically built on its creation perhaps.
In order to get merged with the release code your team choose to run the full E2E suite. It takes a little while, but we can speed it up by running the 5 clusters we need in parallel for each test environment case. These also standup the whole environment, load it with fixture data and run E2E tests on it, maybe some infrastructure ones too, that failover the storage and restore from backup.
But at least they should automatically teardown the test clusters at the end, where as dev clusters can hang around for months without cleanup.
Then once it passes it goes out to the dev environment which has its own testing barriers for release to staging. Staging should be the same hardware resourcing as Prod so that it properly tests that it is working for it, perhaps with some load testing or maybe that is done in another set of clusters.
Finally it gets to roll out to production, but maybe for safety prod has a set of canary environments it goes to first, for final validation before it can be rolled out to customers.

So to get 20 lines of code into production. We could easily have a process like the above, that involves spinning up over 10 temporary k8s clusters and uses hundreds of longer life ones. Just running the E2e and infra tests will take over an hour.

This is seen as good practise in the Cloud world. Rigorous testing before release to production. It is pretty common for companies producing a cloud service. Since most software companies now have to produce a cloud version of their product to satisfy the markets then that is a lot of companies. For the first year or so, all this will be run at the cost of millions of dollars, with hardly any customers using it. Because that is what you do. Agile, get the product out, then grow and refine it and the team developing it. Build it and the customers will come.

This is a hugely wasteful process, and it is not far from Crypto in terms of generating emissions, for something that has no practical use yet.

If we do end up with a lot of customers fine, but for services that are not multi-customer architecture, ie big revenue small customer numbers, there may well be customer specific customisations of the product ❄❄snowflake alert❄❄ So the easy option is the duplication of as many clusters in dev and staging as are in prod, to cater for fully testing for those big clients. So a great deal of duplicate resource spend.

So there should be a lot more consideration of sustainability when establishing the above practises for the development to release cycle.

One way to address this issue is to push as much testing as possible down the testing pyramid.
Unit testing is less useful for cloud since the whole point of Cloud and micro-services is to do only one thing and knit together via API calls the full service. Which means there may be very little functionality that can be tested by a unit test, since everything needs to be mocked.

However that doesn't mean that things cannot be faked, fakes allow fast functional testing of micro-services. Fakes can mean the full emulator's of services, eg. Google pub sub. Or running your gRPC services over its test fake, local memory, bufconn.

But the aim should be to establish a full fake test framework that can run up your service on your laptop. Ideally without the need of a k8s fake like kind to stand it up. Since we don't want to fake the deployment environment - just the running code. Functional tests can then be written that can be used like unit tests to check PRs pass in seconds as part of a git workflow. Running those same tests at regular intervals against full deployments can validate that they correctly mimic them.

There should be layers of tests that validate code before the E2E test layer and do not just have unit and E2E, since then the validity of the code relies on full deployment testing. Full deployment testing should just be run as part of the release process,  it should never be run at the PR validation level, it takes too much time and energy.

Developers can have reasonably long lived personal dev clusters not one per PR, maybe even resort to shared dev clusters per team, to reduce spinning excessive amounts of cloud resource for development.
Automated shutdown based on inactivity should be the norm.

Time should be invested in developing good sample data for non production environments. They should not resort to duplicating all customers, regions or whatever sharding. Plus a bunch of test versions of them. If you have more things running in dev than in prod, you are doing things wrong.

Another route to take is to only have production for long lived deployed clusters. With temporary clusters for automated testing and the use of feature flags to cater for final stage testing in production sandboxed feature enabled clusters, prior to full release. So this separates deployment from release - the latter can then be moved outside of engineering, once a flag has passed testing and validation.

Temporary clusters can use tools such as vcluster for automated short lifespan k8s clusters, significantly reducing the resource usage and speeding up the spin up time, for dev clusters. Hundreds of pseudo separate k8s clusters for dev and testing can be run in a single k8s cluster.

Anything else?

The explosion in data is not just all video streaming. Observability is a huge topic, the amounts of telemetry and logging that a well SRE engineered service needs can be overwhelming. Clear management of that, and limits on retention (at least outside of cold - ie tape / optical - storage) are essential. Such things as the ability to turn on higher info debugging levels for very restricted sets of environments. Provide valid ML learning data sets without filling up data lakes of hot storage, etc.

There are still so many more things that impact Cloud sustainability in terms of Cloud applications ... however this blog post is already unsustainably long 😀. So I think I should end it here.

The main point is Cloud can be the sustainable option, but only if cloud engineers put in the effort to make it so, by pushing for the most sustainable architecture, development and release practises in our every day work.





Saturday, 13 May 2023

Ten Software Engineering Managers

Engineering management from the perspective of the managed.

I have worked in the software industry for many years. Working in both the public and private sectors as an individual contributor software developer, SRE and cloud engineer. 
Along the way I have been managed by 15 different managers, along with having work interactions with around another 100. So naming no names I thought maybe it is worth distilling my managed life into a set of software manager caricatures.
To illustrate what makes a good software manager (and a bad one).

I accept that since I have never chosen to become one, then I am criticising a skill I have shown no interest in acquiring myself. I have only partially dabbled in it, via senior IC roles, ie Staff Engineer technical advocacy. However it is a good chance to let off steam ... and possibly a software manager may read this and reflect on which characteristics they may want to work on. So guys its time for your 360!

Remote managers are better

Having worked for many years in offices with my manager sat at the desk behind me, literally looking over my shoulder. I should probably also admit that in recent years I have chosen to work full time remote, ideally with my manager in another time zone, or at least another country. Luckily I got on with those managers that were literally breathing down my neck. But it certainly didn't help me get the job done.

Full time remote is probably not so good an option for those who are just starting their career. However for established engineers it does tend to insulate you from the various forms of toxic management out there and lets you get on with the job. It also requires you to be more explicit about the engineering process, collaboration and documentation, and hence be more productive in a more maintainable manner.

Bullying, micro-management, under-mining, overly personal etc. Although these can all still happen in front of colleagues on Slack and Zoom. But it is easier to shut down a conversation and walk away virtually so that the manager has time to calm down and control their behaviour. 

Manager's skilled at managing remote international teams have to be skilled at targeted succinct and effective communication. Especially if they only have a few hours in the day when the team members time zones overlap. 

So on average I have found remote managers to be better managers. Although that may just be that the UK is renowned for bad management - both anecdotally and in terms of the UK's productivity ranking. So managers that are from more productive countries than the UK are likely to be from a better management culture - hence better managers.

Peer level managers are better

In a traditional hierarchical organisation. The CTO is the most senior manager and so remunerates people more who are managers like themselves. So particularly in the public sector there is an unwritten rule that even the most junior manager must be paid more than the most senior engineer.

This approach naturally leads to some rancour amongst senior technical staff who want to stay technical. It also devalues technical skills. Since to increase their salary technical staff must eventually give up all their years of technical skills and somehow gain 10 years of skill in people management overnight. Of course this doesn't happen so instead you get very novice people managers with a lot of largely irrelevant technical skills, and perhaps a personality totally unsuited to enabling their team.

It is easy to filter out such an organisation. Just get on one of the company feedback sites, eg. Glassdoor, Fishbowl etc. Check what the top IC engineers salary range is and check that it is higher than the junior management range. Ideally you should expect the top IC grade, eg. senior technical architect to be paid around 50% more than an engineering manager. But take it as a red flag if there is no technical grade that is paid more than any management grade.

Because it means the organisation doesn't value your engineering skills, and you are likely to be managed by someone who doesn't value them either and may regard themselves as your superior. So why would you work there? Surely better to work for a manager who treats you as a peer in an organisation that values your skillset.

Most people quit their job because they have a bad manager

The surveys tell us 43% leave their job because of a bad manager. With the next most important reason being general toxic culture / under appreciation. Whilst pay and progression comes in third.
We all want to get the number 1 manager, but unfortunately we often end up with 10.
So most people leave their jobs because of getting one of the bottom ranked manager types.

HR's job is to minimize disruption to the company, they will tend to take the side of the more senior employee unless that employee's behaviour is clearly proven to be detrimental on the wider scale to the organisation.

A few junior employees reporting them for incompetence or abusive behaviour, does not usually qualify.
So if you do complain then it is unlikely to improve the situation, and may well make it worse if the manager is informed that you complained about them. Sadly I have not personally heard of any case where complaints about bad management resulted in resolving a problem, but definitely have heard of cases where it made it a lot worse for the remaining time before the complainant's notice period was done.

So unless a more senior member of your organisation is on your side, and decides to deal with the problem manager. It is best to leave your job if you have a bad manager. For a big company that may just mean changing departments. But leaving your job is safest in terms of removing yourself from a toxic environment, plus you can honestly report the manager as being the reason for leaving in your exit survey. It is a chance for you to help your ex-employer, you shouldn't expect them to act upon it immediately. But given a sufficiently high attrition rate for a manager's team, the higher level managers should hopefully have enough sense and self interest to deal with their failing colleague.

Ten Software Managers

1. The Enabler 

The ideal technical manager is the enabler. They have better personal skills than software skills.
They are most likely to either have never been an engineer, ie they entered the industry as a professional technical manager. Or they were an engineer some years ago but are more interested in the people than the code so changed direction fairly early into their career.
They will be well aware of the wider environment of the organisation and the stakeholders and drivers for the technical work. A great communicator. With knowledge of all the systems and processes and so how to unblock things and get stuff done procedurally. Plus be a talented scrum master.
 
Most likely to say:  How should we fix that process for you?
Catchphrase: Thanks for your contribution 😊

2. The Big Cheese

You may find at some point in your career you get a manager who is actually a much more senior person in the organisation than their surface role as an engineering team manager. Maybe they founded the whole company, or they are the head of large department with other non-software engineering work.

To be high in the organisation they are likely to be a better manager than the average manager you might expect, with great people skills.

(Obviously this rule may not hold once you get to CEO level, I hope nobody who ended up with Elon Musk as their direct line manager is reading this ... although I don't think he was interested enough in software to directly manage software engineers, given he gave up coding at 20 before getting any formal training in it.)

They are likely to be more of a leader than a manager, but likely to be particularly good at fostering the development of their engineers. They are also going to have all the contacts to unblock any issue that may arise, plus have their finger on the pulse of high level changes and strategy that may impact your job.

They may still have a surprising high level of technical understanding of the company's software, but as a senior manager they also understand that their job is all about enabling an even higher level of understanding, and technical decision making about it, in their engineers.

On the down side, they probably don't actually have that much time to devote to you personally, so don't be surprised if they fail to act on things you have suggested to them. If there is any issue they make damn sure somebody in the team takes responsibility for sorting it out. So be prepared to be volunteered by them.

Most likely to say: How are you aiming to progress in the company?
Catchphrase: Isn't it amazing what we are building at (fill in organisation name)? 🏆

3. The Bro 

They were happily doing their engineering job when the manager for their team left. They were not necessarily the most technical member of the team, but they were the one who got on with everyone and the one that everyone in the team was happy enough to have as a manager. So they took the job.

They want to be you best friend and genuinely aim to protect you as a member of their team from problems or issues that are coming down from higher up the management hierarchy.

They are reasonably technically aware and skilled, but don't try to make the technical decisions or deal with issues unless nobody in the team steps up to the plate, then they take the task on themselves.

They are just one of the gang, but your manager too.

Most likely to say: Let me buy you a beer ... umm, sorry, everyone in the team has been made redundant.
Catchphrase: Yeah, what the hell are management up to 🍻

4. The Middle Manager

They used to be a techie, many years ago, but they weren't really interested in tech and hence were probably pretty mediocre at their job. But thankfully they got on the management track, result! They love the politics and intrigue of management way more than technical details. They have good people skills but find it hard to hide the fact they have absolutely no emotional investment or interest in technology.

They literally couldn't care less if the head of the organisation declared that from this day forth all software in the company will be written in Cobol and all open source was banned from use. If that is what their boss says, then their job is to listen to their team moan and complain about it. Then tell them that is what they will be using. Since any technical objections the team gives are meaningless to them.

On the plus side they do not micro-manage and they appreciate their team members skills, and are good at bringing those skills out.

The middle manager likes people and wants to please them. But knows that their job is only to please their superiors. From their team they just need compliance, and being good at their job is all about bringing their team onside, with whatever the higher ups require. However outlandish.

Most likely to say: Sorry it wasn't my choice, but come on, we need to get on with it.
Catchphrase: I raised your issue with senior management, but no go. 😢

5. The Invisible Man

The invisible man works in a big organisation and knows the value of his super power. He used to do a bit of useful work, but that was years ago, when he still took enough interest in his job to get to the bottom rung of the management ladder. But over the years he realised he could get away with doing less and less actual work. He mastered quiet quitting years ago, before it became a thing. 

Since then he (and his boss) has worked out that if he gets a team of reasonably senior self starter engineers, then they don't actually like or want to be managed. So a difficult opinionated team for some managers, are actually perfect for the invisible man. Ideally if they are a distributed team, then he can "manage" them remotely, with the minimum level of work. Send the odd email maybe, do one or two Zoom calls a week and his work is done.

His team may not respect him, they may even play jokes on him. But he so doesn't give a crap about work that he won't even notice them doing it. As a manager the invisible man is the mid-point of the top 10.
He is neither a good or bad manager, he is like having no manager at all. He will never support, challenge or rebuke you. At least nobody every quits their job because they have the invisible man as their manager.

Most likely to say: Sorry just had to step out for a minute.
Catchphrase: Keep up the good work guys. 👻

6. The Over Employed

The over employed is a people pleaser. They like to say yes to everyone, including their managers and their team. They like to be seen dashing about doing things. So sure they will do that for you tomorrow ... but tomorrow never comes! Because the over employed is too busy. They may even have got themselves a second job on the sly, thinking they can juggle both at once. They are such an optimist, of course it will work out. 

They carry on saying yes to all those tasks that you need them to get done, to unblock your work. Just as they do to everybody else. So sure, they will sort out your performance review. Talk to the other manager that you need info from. But somehow they never seem able to deliver on time, if at all.

They will be there at 3am the night before a major deadline, chucking something together that is not quite finished and misses some vital component. But it is good enough, should do what is needed.

Poor people pleaser working their arse off to please people, so why is nobody that pleased with them?
But they stay cheerful,  not going to let those moaners drag them down.
Oh well if colleagues get too annoying they can always bail out. Get a job somewhere else and leave behind all those trivial little tasks that people kept bugging them with.

Most likely to say: Yes sure, I will do that.
Catchphrase: Hiya guys, why the long faces? 🤯


7. The Team Player

For the engineers in their team the Team Player is one of the best bosses they have had. They always have their back, supports them and persuades those above him to funnel more resources and authority to their team.

They are ambitious and aiming to rise higher up the management tree, but loyal to their guys. They know their team is really the only one in engineering that is run properly. It is also the one doing the work that matters. They makes sure they sweet talks those above them and dedicates a reasonable portion of their time to making sure they know that they and their team are the keystone keeping the organisation running.

They know that anybody looking to advance their career should be spending a good portion of each working day working on their own personal advancement. Don't be the fool who spends all day every day just doing the organisation's work.

Unfortunately their highly competitive nature and self belief, can lead to self delusion. They start to believe their own self promotional narrative. This leads them to be contemptuous of those annoying flakes in other teams who are not doing anything of real value. Though generally energetic and positive those outside their circle and below their grade, get the aggressive, bullying, dismissive and unpleasant side.

This behaviour also distorts the true importance and funding that the organisation should be devoting to their team's remit, to the detriment of other areas. So can cause problems for the company as a whole, as well as for morale outside of their team.

Most likely to say: My team has got that, we will save the day.
Catchphrase: Get out of my way, unlike you, I have a real job to do. 💪


8. The Bureaucrat

Once upon a time, long long ago. Some software companies decided they wanted to sell their wares to the ancient hierarchical institutions of government and learning. Those institutions believed in traditions and rules and processes and making things quantifiable. So something that did that for software management was perfect. So the companies came up with traditional names signifying regal wealth and power - Prince. Along with naming their software, Enterprise software, signifying new, wealth generating and boldly taking the initiative. That was in the 1980s.

It was the perfect sales pitch for these outdated institutions and they bought into it wholesale. Although it took them about 25 years to get around to it, old institutions are like that.

So their procurement processes and software and its life cycle and management were bound into reams and reams of bureaucratic processes. The IT managers in those institutions were groomed in the ways of Prince 2 and ERP and ITIL and all the rest of the snake oil the companies had invented. They devoted all their money and time to the training and meetings and processes around it. 

As far as the engineers in those institutions were concerned, a few of the processes were useful but that was far out weighed by the whole bureaucratic burden and costs they were wrapped in.

The manager spoke at length for years at far too many meetings of the process and the newly procured systems, but unfortunately the quality and features of the institutions systems seemed to have gotten worse. Whilst the cost of them became much much greater.
The institution employed more and more managers, although they just managed projects not people. But eventually there was so many, they needed managers for them too.
But they hired no more engineers.
Eventually the institution decided it didn't really need any in house software engineers at all, why were they writing software when they should be buying proper Enterprise software from the companies?

So the engineer realised they had to leave the kingdom of the Prince 2 and its manager and go to live in a different place altogether, where their business was making software. Strangely in the software republic they had never even heard of Prince 2. They vaguely knew of PMP, but nobody in their right mind would use it to make or manage software.  

Most likely to say: I am sorry I cannot talk to you, until you have filed a change request.
Catchphrase: Our KPIs show that we are on course for all our CSFs 👑


9. The Technical Superior

The technical superior is at least a grade or two above you and always interacts with you as your superior.
They were an engineer and still secretly preferred being an engineer to their current job. They were never the best technical engineer in a team, so they compensated for that by imagining they were the best at seeing the big picture engineering wise and still are. So they decided to become a manager to make sure the right technical decisions are made.
 
They probably preferred their old job because they didn't have to spend so much time on politics and relating to people. They have been a full time manager for at least 5 years and their technical knowledge and judgement have dated badly. However they still see themselves as the person most qualified to make technical decisions. The more senior they become, the more out of date their technical knowledge becomes, yet the bigger and more expensive are the technical choices they make for their organisation. 

Most likely to say: Never ask a bunch of developers to decide on technology, ask 5 and you get 5 different answers.
Catchphrase: We really need a big monolithic Java XML SOAP web service to do that. 🙈


10. The Rockstar Techie

The very worst engineering manager is the rockstar techie.

Rockstar techies have great technical skills and may have technically saved the day a few times for senior management. So their lack of personal skills are overlooked, and they tend to behave better to people above them in the management hierarchy anyway.

But in the long term they are damaging to the quality of your codebase, the more senior they become the more damage they do. So the common technical issues they can cause are blocking devolution of architectural decisions and diverse input into them. Possessiveness over code, or technical knowledge. Wanting to be the saviour for technical problems, outages etc. 

However the damage they do to the code is minor compared to the huge damage they do to the engineering team and culture if they are put in a management position.

They were often the most senior technical engineer in their part of the company. They have probably been there a while and to justify getting another pay rise they got lumped with doing management too. They regard management as an annoying burden tacked on to the side of their real engineering job. They aim to remain the lead engineer in the team and make all the final technical decisions. They cannot devolve technical authority and have no interest in picking up any management skills. So are most likely to exhibit basic level fails in terms of interpersonal skills, have a technical superiority complex, be rude, moody, bullying, micro manage etc.
 
As time goes on they must devote more and more time to management and yet cannot accept no longer being the most technically adept guy in the room. A paradox that can only be solved by driving away any members of staff who challenge them technically and effectively dumbing down the technical skills of the whole team.

Most like to say: You Fffing broke it you moron, when coming across a feature change implemented in a way they didn't expect.
Catchphrase: You are wrong, this is how it must be done, idiot 👹


(Any resemblance to persons living or dead is purely coincidental)

Friday, 27 January 2023

Tech sector lay offs

INTRO: Having failed to post any technical articles for a few years I feel that my blog is at risk of dying from lack of attention. So to avoid that,  I have decided to mix up its content a bit and diversify from long Technical HOWTOs to more casual short posts whose tech content may vary (or not exist at all) ... so to kick off this one is a short rant about tech news!

Fake News about Tech Industry Collapse

A number of bigger tech companies are laying off staff at the moment. The press reports this as a related to them making terrible errors in misreading trends post pandemic and suddenly hiring way more staff than they normally would over the last year, 2022. Now reality has hit and the tech sector is awash with newly redundant workers as big tech desperately tightens its belt to survive. But is there any truth in either the premise of this argument or its conclusions?

A random recent example that repeats these ubiquitous assumptions ... https://fortune.com/2023/01/23/big-tech-layoffs-15-20-percent-next-six-months-top-analyst-says/ ... Citing the not for profit basket case, Twitter, being turned into a uniquely loss making zombie company by Elon Musk as though Google, Amazon and Microsoft had something in common with it as a business!

If you look at the graph of the employee count of these companies year on year. Then only Microsoft hired more than usual last year, 2022, and Amazon to cope with its pandemic boom did so in 2020. Google last had an uncharacteristic hiring spree in 2012. Similarly Facebook and other big tech companies growth curve exactly followed that of the last decade in recent years. There was no extra hiring.

So why the layoffs. Nothing to do with over hiring. Simple - look at the share price curves instead.
The market is valuing most big tech lower and recession looms - they need to chop staff to chop costs and make their finances look better to reduce that drop for their shareholders - the biggest of whom are the CEOs of those companies.

The big companies are still making billions in profit (they are not loss makers) so over the long term it would cost them less to retain talent, and they can afford to. However the CEO's personal short term loss in wealth is something that they can't stomach, and it is a good excuse for a clear out.


Obviously there are large loss makers, the most prominent of which is Twitter, but they are special cases with failing business models - Twitter only ever made a profit in the run up to Trump's election as it became a huge engine churning up ideological conflict with political and conspiracy fictions. Without a politically polarised USA to drive an explosion of lies and social media wars, it was always a loss maker. It has nothing to do with big tech trends. 


Apple is a real exception to the trend, currently, but only because its share price hasn't dropped significantly yet. Hence it hasn't done its layoffs yet.

So the real reason big profitable tech is laying people off is a temporary fix to save a few billions from the bear market's current swipe at the personal wealth of their CEOs. Even though being a slave to short term market driven fire / rehire cycles will cost the company more in the long term. It is purely a personal choice to save personal wealth, there was no over hiring, there is no need for redundancy, there is no down turn in the growth demand for the tech sector, there is just less cheap loans around to fund it.

 
The jump in the cost of borrowing is due to what looks to be a short term hike in interest rates, nothing to do with the tech sector itself. Largely due to Putin starting a war of revenge against his long dead ghosts from 35 years ago, when the Soviet Union fell as he presided over the KGB. A war against ghosts can never be won ... but unfortunately its far more terrible human cost will carry on for years.


This becomes clear on the other side of the market coin, late stage startups are often making no profit at all - because all profits along with loans are ploughed into growth. Because they must never be seen to be shrinking - to keep growing their valuation for IPO.


So a lot of software companies are hiring even though strictly they don't need to, whilst the big boys are firing when they don't need to.


Meanwhile the number of software jobs as a whole keeps growing at 10% year on year. The demand continues to outstrip supply and wage inflation follows. So if you are a talented (if overpaid) software engineer ... I wouldn't worry too much about the layoffs. It is just a chance to take a redundancy bonus and a 20% pay rise to try something new. Unfortunately for those that were used as a disposable foreign human resource by big tech, via job dependent visas, it is a different story. They may well not have time to stop their CEO's thoughtless greed needlessly disrupting their lives.

Many predictions are that interest rates will drop and the bear market end in a year's time. At which point the mass layoffs will be reversed. But the big tech companies will have lost a lot of money, and a great deal more trust, by following the market and each other so closely. The lesson employees will have learnt is that loyalty to such companies will never be rewarded or returned.


Thursday, 13 February 2020

K8s Golang testing - Mocks, Fakes & Emulators

A lot of the Go code I write is developed against Google's Kubernetes API.
The API is fairly large and given that the code is mostly calling K8s then it inherently has a set of complex dependencies, these dependencies have time and costs associated to run up for real as K8s clusters in cloud providers data centres.

So how can we test K8s Go code ... or any Go code with significant dependencies. We must use substitute objects that simulate the dependency objects. There are three common terms for these substitutes known collectively as test doubles. They are mocks, stubs and fakes. Unfortunately these terms are all pretty interchangeable. So before I start bandying them about, I had better define what I mean by these and related terms for this blog post ...

StubA function, method or routine that doesn't do anything other than provide a shim interface. If a stub returns values then they are dummy values (possibly dependent on calling args, or call sequence) that are either fixed or generated for a fixed range.

Mock = An object which replicates all or part of the interface of another object, using stub methods.

Fake = An object that replicates all or part of the interface of another object, and has methods which are not all stubs, so some methods perform actions that simulate the actions of the real object's method.

Emulator (full fake) = A package that has a significant amount of faked methods (rather than stubbed ones). For example a database server will normally provide an in memory database configuration that will completely replicate the core functionality of the database but not persist anything after the test suite is torn down.
Normally an emulator is not part of the code base and may require service setup and teardown via the test harness. As such, use of emulators tends to be for integration tests rather than unit tests.

Spy = A stub, mock or fake that records any calling arguments made to it. This allows for subsequent test assertions to be made about the recorded calls.

K8s Go Unit Testing

Given that unit testing by definition should not need any dependencies, then the assumption might be that for dependency heavy code, most unit tests will require test doubles ... the follow on assumption is that double == mock.

Mocks

Hence a standard approach to this is to use one of the Go's many generic mocking libraries, such as https://github.com/vektra/mockery,  https://github.com/stretchr/testify or  https://github.com/golang/mock.

There are numerous tutorials and explanations available to get Gophers started with them, for example this walk through of testify and mock.

These mock tools all offer test generation from introspection of your API calls to the dependency, to reduce the maintenance overhead.

So for everything we write tests for we generate mocks that reflect our codes API and confirm that it works as we expect. However there is a problem with this, the problem is described in detail in this blog post by Seth Ammons or in summary by Testing on the Toilet

The issues with mocks are:
  1. Mocking your code's calls to an API only models your usage and assumptions about it, it doesn't model the dependency directly. It makes your tests more brittle and subject to change when you update the code.
  2. Mocks have no knowledge of the dependencies they are mocking so for example as Google's APIs change - your real code will fail, but your mocked tests will still pass.
  3. Mocks may use call sequenced response values, so making them procedurally fragile, ie changing the order of your test calls can break mocked tests.
  4. If you want to swap out one library with another for a component, then because your mocks are specifically validating that libraries API, your mocks of it will need to be regenerated or rewritten.
So what is the alternative... 


Fakes

Refactor your code to be testable by using interfaces.

Break things down into simpler interfaces and create fakes that implement the minimum methods for testing purposes. Those methods should perform some of the business logic in a simulated manner for them to better test the code and relation between method calls than pure stubs would. Your model of the dependency is direct rather than based just on your calls of its API, so arguably easier to debug when that model and the dependency (and your evolving uses of it) diverge. 

Use ready made Fakes

But back to K8s and Google APIs ... in some cases the Google component libraries already have fakes as part of the library. For example pubsub has pstest. So you can just add the methods required so that things work for your test. In which case faking can be simple ...


The client-go library has almost 20 fakes covering most of its components but the only other fakes already in the K8s go libs (that I could find!) are for pubsub and helm

cloud.google.com/go/pubsub/pstest/fake.go
k8s.io/helm/pkg/helm/fake.go
k8s.io/client-go/tools/record/fake.go
k8s.io/client-go/discovery/fake
k8s.io/client-go/kubernetes/typed/core/v1/fake
... etc

However there are also third party libs for fakes such as 
https://github.com/fsouza/fake-gcs-server

Use custom built Fakes
If there is not an existing fake or it doesn't fake what you need, then for Google libs the APIs you need to replicate are not simple and you may want to simulate a number of methods for your tests. So manually creating the fake and maintaining its API against the real Google API becomes too much work, compared with autogenerating mocks. 

Google have sensibly anticipated this and hence released
google-cloud-go-testing
This package provides the full set of interfaces of the Google Cloud Client Libraries for Go
 there is no need to generate mock partial interfaces or maintain your own fake versions of its APIs.

As an example it can be used to create a fake GCS service, where data is just written to memory (in the global bucketStore variable)

The test substitutes the FakeClient for the real storage client. In order for the code to accept the real or fake client as the same type the library provides an AdaptClient method so both conform to the storage interface (stiface).
c, err := storage.NewClient(ctx, option.WithCredentialsFile(apiCredsFilename))
client = stiface.AdaptClient(c)


K8s Go integration Testing

For integration tests you ideally want to use the real dependencies, but if they are too slow or costly then they may well be best replaced by emulators.

Using gcloud emulators

Google also provides a number of full emulators to cater for speedy local integration testing,
https://cloud.google.com/sdk/gcloud/reference/beta/emulators which cover bigtable, datastore, firestore and pubsub.
So as part of your integration test setup you can can fire up the datastore emulator for example 


> export DATASTORE_EMULATOR_HOST=localhost:17067
> gcloud beta emulators datastore start --no-store-on-disk --consistency=1.0 
                                        --host-port localhost:17067 --project=my-project
The datastore client can then just be connected to the emulator for testing
client, err := datastore.NewClient(context.Background(), "my-project")
Using EnvTest and a local K8s API server

The EnvTest package creates a Kubernetes test environment that will start / stop the K8s control plane and install extension APIs. The K8s API server (and its etcd store) is by default a local emulator service (although it can also be pointed to a real K8s deployment for testing if desired).

EnvTest is wrapped up as part of Kubebuilder which is the primary SDK for rapidly building and publishing K8s APIs in Go. 

EnvTest caters for testing complex Kubernetes API calls of the type that might be required for testing a K8s operator for example. Hence when generating code for building an operator,  kubebuilder uses the controller-runtime in its boilerplate for running this up for a template integration test.



Summary

So if your K8s Go code is tested only by mocks for unit tests and running up a real Kubernetes cluster for integration tests, then maybe its time to re-evaluate your testing approach and start using the tools for fakes and emulators that are available. The only issue is that they are quite numerous with a mix of sources, so picking the right mix of Google internal lib, Google or 3rd party test package or custom built fakes and emulators becomes part of the task.

Sunday, 1 September 2019

Teaching an old Pythonista new Gopher tricks

I recently got a new job where I need to write a lot of Golang, so needed to learn it.
I figured that you don't really learn a language unless you try and write code that actually does something useful. However having been to a recent Golang meetup where someone had come to a similar conclusion, and had written a full emulator of the Gameboy in Go - I also figured I wanted to do something that was not quite so complex or low level ... ie hopefully, could be done in a week.

So I decided to take the plunge by creating an open source package that does the same job, as a Python one that I released many years ago called django-csvimport. A simple add-on for the Django ORM that caters for loading data to models from CSV files, with the option to generate the model code from scratch for a CSV file by checking the data fields and determining the data type for each column.

Also doing a task where I had solved the problems in another language would mean I could just focus on how Golang might approach the problem, not the problem itself. So this post is about the practical differences between writing a Python and Golang solution. As such it compares the languages as tools for a certain job, which I hope is complementary to the many posts that compare the languages themselves. Suffice is to say, they differ in many ways ... most significantly in static vs. dynamic typing ... whilst being most similar in regarding readable consistent simple syntax as paramount - where other languages have different priorities - hence for both auto-formatting code is good practise, with Go's builtin go format doing the job of Python's black or yapf.

So firstly Django is one of the leading full web frameworks for Python, so what is the equivalent for Go? Gorilla, Gin, Buffalo etc. there are plenty of frameworks but which is the leading one with an ORM? ... I tried out a couple but reading around it, it became apparent that if you choose to develop a web app in Go, then the majority of devs don't use a framework at all!, so already the differences in the languages was becoming apparent. Reasons? If you choose Go for creating a web app then performance may be a significant requirement, even micro frameworks can be slower than raw code. Go is a recent language and as such has lots of web related features built into the core already ...  templating, etc. and even imports are url based so a web framework in Go gives you less than it does in Python.
So instead I checked out Go ORMs and decided to write an extension package for Gorm as one of the leading Go ORMs.

So ditching the Web Framework / UI integration features of django-csvimport as an unnecessary extra, then the problem just consists of two parts, creating ORM model definitions that create relational database tables and parsing the CSV files to import the data to those tables.

From this high level spec. the core functional components that compose the tool that we want to rebuild in Golang are:


  1. CLI interface to take arguments specifying source files and actions to perform
  2. An ORM to manage vendor independent database schema creation and population
  3. Utility to inspect data sources and determine data types
  4. Template tool to create ORM models (metaprogramming)
  5. CSV parser to read in CSV files - ideally capable of handling various formats and poor or inconsistent formatting - ie real CSV files!
 For all of these we would hope for language level packages are available to do the major lifting. Then the package can just knit them together into a CSV to relational database import utility.

So stepping through these and rating Go vs Python...

CLI framework (draw)

As a minimum, our task requires a command line utility to point to the CSV data files to be imported.
Django comes with a CLI framework in the form of management commands. For our Go CSV import, gormcsv, we just have the ORM so we could roll our own CLI handling, but in this case that is probably not a great idea, since like Python, Go has a dominant CLI framework - Cobra equates to Python's Click. It uses the Viper config framework which is like Python's core configparser lib with extras. Within the gormcsv module I created these CLI command go files as a cmd package via Cobra's autogenerate feature and used them to wrap the importcsv.go and inspectcsv.go source files in the importcsv package that do the real work.

ORM (draw)

Any language's leading ORM's should cope with the database management and data population tasks and GORM is functionally similar in its capabilities to the Django ORM


Data source introspection tool  (Python win)

Messytables is a mature package designed for the task of scraping in data from various heterogenous third party sources - possibly of poor quality. As such it is one of the many utilities created around  Python's well established role in the data analytics realm. Go has no such tool. There is no third party package to cater for inspecting, type checking and cleaning up data sources :-(
So we have to make our own much simpler data inspector that will hopefully cope Ok with the most common data types if they are reasonably consistently formatted.

Templating tool for creating models (Go win)

For GORM and Django the ORM models are implemented directly as classes in the language rather than using an intermediate DSL or XML etc. So to create models based on introspecting source data metaprogramming must be used to generate code.
Templates are available in the core of Go. Also given it is statically typed and has no generics, then for some problems that generics would solve, the best alternative is to use metaprogramming. Hence templated generation of Go code is a normal Go pattern. So arguably this is better (core) supported in Go than Python. For Python code generation is rarely needed, and my original django-csvimport implementation just uses string construction and didn't even employ one of Python's many add on template packages, eg. Django or Jinja2 templates (hmm needs a rewrite!)
Note that both languages have fully functional reflection / introspection libraries in the core.

CSV Parser (Python win)

Most important to this application is the quality of the CSV parser. This is where Go is sadly completely let down. Its CSV parser is frankly inadequate and can only cope with CSV that is strictly formatted according to RFC 4180.

To quote from Python's csv parser library ...

CSV format was used for many years prior to attempts to describe the format in a standardized way in RFC 4180. The lack of a well-defined standard means that subtle differences often exist in the data produced and consumed by different applications. These differences can make it annoying to process CSV files from multiple sources.

TBH Python 3's CSV parser is itself significantly more strict about format than the old Python 2 one and so certain CSV files cannot be parsed that Python 2 happily dealt with - largely due to the switch to unicode resulting in more character encoding related critical fails.  However the Go parser is a whole other level of strict and realistically it can probably handle less than 10% of the real world CSV source files out there that you might want to scrape data from, into a database. Whilst Python 3's can probably cope with over 80%

I also investigated third party Go librarys that cater for parsing a more realistic range of CSV formatting, but found none that did so.

Conclusion

So in conclusion, Python may not be a Gopher Snake but for this task it does rather eat Go for breakfast. There is no ready made third party package to deal with ingesting unknown or badly formatted data like Python's aptly named messytables. Golang may sometimes be used for writing performant concurrent data processing in data science ... but it isn't used for the scraping and cleaning data sources part of the job! However this is a minor issue compared to the major blocker of not having an existing library that can import real world (ie sloppy format) CSV files.

So I have written my Go package for pushing CSV files to databases, gormcsv, and due to Go's great concurrency features it could certainly beat django-csvimport hands down in speed terms where big data quantities of CSV sources need ingesting. But I have yet to release it.  Because with such poor compatibility with real CSV files, there doesn't seem to be much point - however I will hopefully persist in finishing things off, probably as a less performant work around to pre-clean CSV files into strict RFC 4180 prior to parsing. Since implementing my own CSV parser from scratch for Go would likely break my original goal's of coming up with an open source project in the language that would take no longer than a week!

Oh and what do I think of Go? Well I like it, I most like the concept of classes just being data structs with bags of composed methods loosely coupled to them. I least like the error handling unseparated from normal code flow ... since it can lead to poor readability of code due to the excessive error boilerplate stuck within the program flow. It is my new favourite (statically typed) language ... but it hasn't replaced Python as my overall favourite.