
Sunday, 9 June 2024

Software Development with Generative AI - 2024 Update


Why write an update?


I wrote a blog post on Software Development with Generative AI last year, questioning the approach of the current AI software authoring assistants. I still believe the bigger picture holds true: fully utilizing AI to write software will require an entirely different approach, changing the job of a software developer in a far more radical manner and perhaps making many of today's software languages redundant.

However, I also raised the issue that I found the utility of the current generative AI helpers questionable for seasoned developers:

"The generative AI can help students and others who are learning to code in a computer language, but can it actually improve productivity for real, full time, developers who are fluent in that language?
I think that question is currently debatable... (but it is improving rapidly) ... We may reach that point within a year or two"

Well, it hasn't been a year or two, just six months. But I believe the addition of the Chat window to Copilot and an improvement in the accuracy of its models have already made a significant difference.

On balance I would now say that even a fluent programmer may get some benefits from its use. Given the speed of improvement it is likely that all commercial programming will use an AI assistant within a few years. 


To delay the inevitable and not embed it into your work process is like King Canute commanding the sea to retreat. There are increasing numbers of alternatives available too. However, as Copilot is the market leader, I believe it is worth going into slightly more depth on its current state of play.

GitHub Copilot Features

The new Chat window within your IDE gives you a context-sensitive, ChatGPT-style version of Copilot that can act as a pair programmer and code reviewer for your work.

If you have enabled auto-complete, you trigger it by writing functional comments, i.e. prompts, then tabbing to accept the suggestions it responds with.
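
For instance - a completely invented example - you might type a comment describing the next step, pause, and accept the body Copilot proposes with Tab:

    import csv

    # read a csv file of daily sales and return the total of the "amount" column
    def total_sales(path):
        total = 0.0
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                total += float(row["amount"])  # body suggested from the comment above, accepted with Tab
        return total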

To override these prompts, you can instead type a dot and get the IDE's real code completion options (as long as your IDE is configured correctly). Since code completion has your whole codebase as context, it complements Copilot reasonably well. But whilst code completion is always correct, Copilot is less so: probably more like 75% now, compared to around 50% at its initial release.

It takes some time to improve the quality of your prompting. An effort must be made to eradicate any nuance, assumption, implication or subtlety from your English; precise, mechanical instructions are what is required. However, its language model will have learnt common usage. So if you ask it to sort out your variables, it will understand that you mean replace all hardcoded values in the body of your code with a set of constants defined at the top, explain that this is what it thinks you mean, and give you the code that does that.
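
As a rough sketch of that kind of exchange (the function and values here are invented), "sort out my variables" might turn the first version into the second:

    # Before: hardcoded values in the body of the code
    def monthly_invoice(units):
        subtotal = units * 4.99   # price per unit
        return subtotal * 1.2     # add 20% VAT

    # After: the suggested refactor, with constants defined at the top
    UNIT_PRICE = 4.99
    VAT_RATE = 0.2

    def monthly_invoice(units):
        subtotal = units * UNIT_PRICE
        return subtotal * (1 + VAT_RATE)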

You can ask it anything about the usage of the language you are working in, how something should be coded, alternatives to that, etc. So taking a pair programming approach, explaining to Copilot Chat what you are about to code and why as you go, can be very useful. Given that rubber duck programming is useful, having an intelligent duck that can answer back ... is clearly more so.

It excels as a learning tool, largely replacing Googling and Stack Overflow with an IDE-embedded search when learning new languages. But even for a language you know well, there can be details and nuances of usage you have overlooked, or changes in syntactic standards in new releases that you have missed.

You can also ask it to give your file a code review, where it will list out a series of suggested refactors that it judges would improve it.

Copilot Limitations

Currently, however, there are many limitations. Understanding them helps you know how to use Copilot, rather than turn it off in frustration at its failings!

The most important one is that Copilot's context is extremely limited. There is no RAG enhancement yet, and no learning from your usage. It may seem to improve with use, but that is just you getting better at using it. It does not learn about you and your coding style as you might expect, given that even a dumb shopping site does that as standard.

It does not create a user context for you and populate it with your codebase. It simply grabs the content of the currently edited file, the Chat prompt text and the language version for the session and sends them as one big query. The same goes for the auto-suggestion, except that there the prompt text comes from the comments or doc strings on the preceding lines.

The lot is posted to a fixed Copilot LLM that is some months out of date, although apparently it gets weekly updates from continuous retraining.

This total lack of context can mean the only way to get Copilot to suggest what you actually want is to write very detailed prompts. It is often simpler to just cut and paste example code as comments into the file - please rewrite blah like this ... paste example - since only if it is in the file or the latest Chat question will it get posted to inform the response.
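
A minimal sketch of that trick, with an invented function and example: the pasted snippet lives in comments purely so that it is in the file, and therefore in the prompt.

    # Please rewrite fetch_user below in the same style as this example:
    #
    #   def fetch_order(order_id):
    #       try:
    #           return repo.get_order(order_id)
    #       except OrderNotFound:
    #           return None
    #
    def fetch_user(user_id):
        ...  # tab here and the suggestion should mirror the pasted pattern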

At the time of writing, Copilot is due to at least retain and learn from the Chat window history, to extend its context a little. But currently it only knows about the currently open file and the latest Chat message. Other providers have tools that do load the whole codebase, for example Cody, plus there are open source tools to post more of your codebase to ChatGPT or to an open source LLM.

As this blog post update indicates, the whole area is evolving at an extremely rapid pace.

The model it has for a language is fixed and dated. Less so for the core language, but, for example, you may use a newer version of the leading third-party Postgres library that came out two years ago, while the majority of users are still on the previous one, since it is still maintained. Their syntax differs. Copilot may only know the syntax for the old library, because that is what it was trained with, even though the later version is being imported in the file and so is in Copilot's limited context. So any Chat window or code prompt it suggests will be wrong.
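
For illustration only - the post does not name the library, so assume it were something like the psycopg Postgres driver - the mismatch looks like this:

    import psycopg  # psycopg 3, the newer driver being imported in the file

    # psycopg 3 can execute directly on the connection
    with psycopg.connect("dbname=shop") as conn:
        rows = conn.execute("SELECT id, name FROM users").fetchall()

    # A model trained mostly on the older psycopg2 will tend to suggest its
    # pattern instead, which uses a different import and an explicit cursor:
    #
    #   import psycopg2
    #   conn = psycopg2.connect("dbname=shop")
    #   cur = conn.cursor()
    #   cur.execute("SELECT id, name FROM users")
    #   rows = cur.fetchall()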

When using the code review feature, I have yet to find it brings up anything useful that I didn't already know about the code, and the suggestions can include things that are inapplicable or already applied. But I am sure it would be more useful when learning a new language.

AI prompting and commenting issue

Good practice for software teams around code commenting is that you should NOT stick in functional comments that just explain what the next few lines do. The team are developers and can read the code just as quickly for its base functionality; adding lots of functional commenting makes things unclear through excessive verbosity.
It is something that is only done for teaching people how to code in example snippets. It has no place in production code.

Comments should be added to give wider context, caveats, assumptions etc. So commenting is all about explaining the Why, not the How.

Doc strings at the head of methods and packages can contain a summary of what the function does in terms of the codebase. They are more functional in orientation, but as a big-scale summary; so again they are a What, not a How.
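
A small, invented Python snippet to show the distinction between the three kinds of comment:

    DISCOUNTS = {"spring10": 0.10}

    def apply_discount(order_total, code):
        """Apply a promotion code to an order total and return the discounted total."""  # What: big-scale summary
        # Why: marketing issues codes case-insensitively, so normalise before the lookup
        code = code.lower()
        # How (redundant in production code): look the code up in the discounts table
        discount = DISCOUNTS.get(code, 0)
        return order_total * (1 - discount)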

It looks like current AI assistants may mess that up, since they need comments that are basically as close to pseudocode as possible. Adding information about real-world issues, roadmap, the wider codebase, integration with other services ... i.e. all the Why, is likely to confuse them and degrade the auto-complete.

Unfortunately, code comments are not AI prompts for generating code, and vice versa.
Which suggests that you may want to write a temporary prompt as a comment to generate the code, then replace it with a proper comment once it has served its purpose.

Or otherwise introduce a separate form of hideable, prompt-marked comment that makes it clear what is for the AI and what is for the human!
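
Purely as a hypothetical convention - no assistant recognises this marker today - a prompt-marked comment might look like:

    # Human comment: totals are cached upstream because the report page is hot.
    #ai> generate a function total_by_region(orders) returning a dict of region -> sum of order.amount
    def total_by_region(orders):
        totals = {}
        for order in orders:
            totals[order.region] = totals.get(order.region, 0) + order.amount
        return totals
    # Once the code is generated, the #ai> line could be stripped or folded away by the editor.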

Alternatively, use the Chat window for code generation, then paste the result in.

Copilot Translation

Natural language translation is an area where Copilot can be very beneficial. As a non-native English speaker you can interact with it in your own language for prompting and comments, and it will handle that, and will translate any comments in the file to English if asked to.

Code translation is more problematic, since the whole structure of a program and its common libraries can be different. But if the code is doing some very encapsulated, common process - for example just maths operations, or file operations - it can extract the comments and prompts and regenerate the code in another language for you.
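
A sketch of the kind of self-contained function (invented here) where this works well: everything needed to regenerate it in another language is in the signature and the comment, with no framework or third-party library in the way.

    import math

    # continuously compounded interest: principal * e^(rate * years)
    def compound_interest(principal, rate, years):
        return principal * math.exp(rate * years)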

One can imagine that one day the only language anyone will need to write will be a very high-level, succinct, English-like language, e.g. Python.
When you want code in a verbose or low-level language, you just write the simpler prompts in a spoken language, falling back to Python where it is faster to communicate explicitly than in speech, since spoken languages are so unsuited to creating machine instructions.
Press a button and Copilot turns the lot into verbose C or Java code with English comments.

1 comment:

  1. Note that this review is purely of GitHub Copilot, not the Office one, Microsoft Copilot. Rather confusingly, they are named the same. Also, whilst Microsoft Copilot's premium version, Copilot Pro, did do RAG with its custom GPT Builder, it has now abandoned that option: https://support.microsoft.com/en-gb/topic/gpt-builder-is-being-retired-d1de6c3a-4c7a-4bcd-98ff-2f65f3d23cd1
    Meanwhile, the premium version of GitHub Copilot is called Enterprise, and has never offered RAG. It does allow the addition of a knowledge base to provide company context, but currently still has the very limited context of just the file being edited.
    Also, alternatives such as Cody tend to just add the method signatures to the context - i.e. what auto-complete uses - rather than the code of the whole package or larger software context.
