I recently got involved in a discussion on a forum about first steps in learning to program, and what it then takes to become a professional software developer. By "professional" I mean people who need to do software development as part of their job role (and they get paid for it!).
That does not necessarily mean that "programmer" or "software engineer" or "web designer" or "IT specialist" is part of their job title. For every person in that group there is probably someone else doing just as much software construction who is simply called "engineer" or "physicist" or "data analyst". They produce systems that may be just as complex and just as critical as those produced by employees with the formal title. (I worked for many years on large and complex systems that helped to ensure the safety of nuclear power stations. I never had anything to do with software, programming or IT in my job title.)
It is quite likely that many people doing a lot of software development will have primary qualifications in other subjects. (It is a common career path for physicists, mathematicians and engineers. I am a physicist and astronomer by training, but my experience in software construction was sufficient to justify me applying for and being awarded Chartered Engineer status by the British Computer Society.)
In a nutshell my advice was divided into two parts:
- The skills required to build high quality software involve several levels of abstraction. You have to start at the bottom, of course, but the real professionals work at developing the higher level skills.
- There is a lot more to building high quality software than programming. Much of my time, even when I am working on one-man projects, would be spent on doing things other than programming - 75% would not be at all unusual. This is the activity that ensures quality in the final product.
Note that the primary difference between the professional developer and the amateur is the need to produce the agreed product at an acceptable level of quality, within an agreed timescale and budget. (There is, in fact, quite a lot of open source around - much of it done for the love of the game and perhaps as a public demonstration of expertise and creativity. Much of it is of very high quality. However, if you want to make a living from software you have to play by the professional rules.)
Programming Skills
Learning the syntax of a language is only one part of programming - but is obviously an essential first step. You need the language to express your ideas - but you have to have the right ideas to make it worth doing the expression. The ideas underlying programming are often called Computational Thinking (and as such are now taught at primary school level). There are two parts to programming (novices typically focus on the first part at the outset, while experts focus on the second):
- Describing an algorithm for transforming input data to output data.
- Structuring information within the software so that the transformations are as understandable as possible to the programmer (and anyone who comes along later to do maintenance). I have discussed this in more detail here.
I have had many years of building software and also trying to maintain software written by others. Much of my experience can be summed up in the following gold nuggets:
- Software that is easily understandable to the person looking at the code is more likely to be free from error.
- If the programmer has designed good, well-documented information structures, it is relatively easy to see what the algorithms must be doing. One can often do without any further description of the algorithms.
- In contrast, if all you can see in the code is the algorithm, it is very difficult to deduce the fundamental information structures.
- For the program to make sense you need to understand the semantics of the information as well as understanding the algorithm.
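As a minimal sketch of the second and fourth points (in Python, with hypothetical names such as `Reading` and `mean_temperature_by_station` invented purely for illustration), a documented information structure lets the reader guess what the algorithm must do before reading its body:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One temperature measurement taken at a monitoring station (hypothetical)."""
    station_id: str       # unique identifier of the station
    hour: int             # hour of day, 0-23
    temperature_c: float  # temperature in degrees Celsius


def mean_temperature_by_station(readings):
    """Return a dict mapping station_id to its mean temperature in Celsius."""
    totals = defaultdict(lambda: [0.0, 0])  # station_id -> [sum, count]
    for r in readings:
        totals[r.station_id][0] += r.temperature_c
        totals[r.station_id][1] += 1
    return {sid: total / count for sid, (total, count) in totals.items()}


readings = [
    Reading("A", 9, 20.0),
    Reading("A", 10, 22.0),
    Reading("B", 9, 15.0),
]
print(mean_temperature_by_station(readings))
```

The same loop written over bare tuples (`r[0]`, `r[2]`) would compute the same numbers, but the reader would have to work backwards from the algorithm to discover what each field means.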
As a programmer, you need to understand what you are doing, but as a professional you also need to ensure that other people can understand what you have done. In fact, unless you can make it clear to me how your software works, I will normally conclude that you probably do not really understand how it works. (You may be convinced that you do, but subsequent events usually prove me right.)
Essentially all advice and guidance on methods of building programs is really about making what you are doing understandable with as little effort as possible. If you understand the code you can see whether it is right or wrong. If you are not sure you understand it, there are almost certainly latent errors lurking to catch you out at some future time.
Let me pick this advice apart in a bit more detail.
Choosing a language to learn: in a professional environment you probably will not have much choice about which language you use. That will be decided by more senior project leaders. Your turn for leading will come. However, to some extent it does not really matter which language you learn first. At the level of complexity at which you will be working in learning exercises, they can all express the necessary ideas - though some may have steeper initial learning curves than others. (My personal prejudices are expressed here.) Most of the art on this website has been constructed in a Java-like language (a sub-set) but all other things being equal, Python has an easier initial learning curve and is now quite commercially useful. I prefer it to Java for handling scientific data, and have successfully used it for processing millions of data items at a time. (I might switch if I had to deal with some of the humungously large datasets now common in high energy physics!)
For real commercial applications the choices are more constrained by the technical requirements of the working environment. Web systems development, for example, needs methods of interacting with a Web browser or Web server, but you would not choose JavaScript for solving differential equations.
Having learned, however, to program well in one language it is usually relatively easy to transfer the higher level skills to another language (a few weeks of effort at most to reach equivalent expertise). There are some caveats here: most people first learn an "imperative" language (e.g. Java or Python or C++) written as a series of commands that produce actions. There is a whole class of "declarative" languages (e.g. SQL and PROLOG) and languages where "recursive" algorithmic structures come to the fore (e.g. LISP and PROLOG). SQL is widely used in anything involving databases and is an essential skill for the broadly based system developer. LISP and PROLOG are widely used in Artificial Intelligence work: they do not appear frequently in job adverts but this may change. Furthermore, it is useful to learn these languages because they require mastery of a different set of programming concepts and you will learn to think in a different and more flexible way about certain types of problem.
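To make the imperative/declarative distinction concrete, here is a small sketch in Python using the standard library `sqlite3` module. The table name `marks`, the sample data and both function names are assumptions made up for this illustration. The first function spells out how to compute the totals step by step; the second states what result is wanted in SQL and leaves the "how" to the database engine.

```python
import sqlite3

# Hypothetical sample data: (student name, mark)
marks = [("alice", 72), ("bob", 55), ("alice", 64), ("bob", 61)]


def totals_imperative(rows):
    """Imperative style: a series of commands that build up the answer."""
    totals = {}
    for name, mark in rows:
        totals[name] = totals.get(name, 0) + mark
    return sorted(totals.items())


def totals_declarative(rows):
    """Declarative style: describe the result in SQL and let the engine work out how."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE marks (name TEXT, mark INTEGER)")
        conn.executemany("INSERT INTO marks VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT name, SUM(mark) FROM marks GROUP BY name ORDER BY name"
        )
        return cursor.fetchall()
    finally:
        conn.close()


print(totals_imperative(marks))
print(totals_declarative(marks))
```

Both functions return the same list of (name, total) pairs; only the way of thinking about the problem differs.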
Algorithms: all programs are an expression of an algorithm, and most of the really useful algorithms have already been invented and well documented (for example, in Knuth's classic and encyclopaedic "Art of Computer Programming"). Professionals build on the high quality work of others whenever possible, using pre-built libraries if available and researching known algorithms for their own implementation if they have to. (I was once delighted to find that an algorithm that was just right for my problem had been invented and documented by a 17th Century clergyman who needed to systematically list permutations for bell "change" ringing.) Languages such as Java, C++ and Python have very extensive libraries providing a wide range of commonly used algorithms.
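As a small illustration of leaning on a pre-built library, Python's standard `itertools.permutations` will list every ordering of a set of bells in a couple of lines. This is only a sketch of the "use the library" route: it produces the orderings in lexicographic order, which is not the order a change ringer would use and is not claimed to be the clergyman's algorithm mentioned above. The bell numbers are made up for the example.

```python
from itertools import permutations

bells = [1, 2, 3]  # hypothetical three-bell example

# Every possible ordering of the bells, each as a tuple.
rows = list(permutations(bells))

for row in rows:
    print(" ".join(str(bell) for bell in row))
```

If the orderings had to satisfy a ringing constraint (for example, that only neighbouring bells swap places between rows), that would be the point to research a documented algorithm rather than invent one.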
Information Analysis: (see also this). The experienced developer working on the design of a large system will probably, these days, start with information analysis, particularly if it is likely that "Object Oriented" methods will be employed. (There is a relatively transparent and straightforward route from OO analysis -> OO design -> OO programming.)
Consider, for example, the student facing interface at a university. (I know about the Open University having completed my Masters in Computing there.) There is a concept of a student (who has a name, address and a unique identifier that never changes). Students are registered for specified degree paths, which are followed by taking modules (with a name and unique identifier) etc. Modules have presentations, associated with TMAs and exam papers. Students connect to TMAs and exams via marks. Presentations connect to tutors assigned to teach them, and tutors connect to a group of students. The key nouns here (student, degree path, module, presentation, TMA, exam paper, mark, tutor) are entities for which information must be stored, and we must also store the relationships between the entities. Finally, we need to know what actions can take place to change the stored information in a fully consistent way, and how we can query information from the system. All this we do before any actual programming. If the correct entities (with their associated attributes) have been identified, along with the correct relationships and possible actions, then it will be straightforward to construct the system and it will also be straightforward to modify it (because nothing will need to be undone).
In contrast, we may consider the case of the Trustee Savings Bank which recently experienced severe customer service problems when it was taken over and the new owner wished to integrate its customer accounts into its own IT system. Now, I do not know exactly what went wrong here, but I have listened to insiders from other banks explaining the potential problems of attempting to integrate their own IT systems and we can have a good guess. In any case, what I am about to describe is a feasible scenario at other organisations. Many of the potential problems stem from early computerisation in which the wrong information structures were built into the systems. At one time you had just accounts that were accessed with bank books. This process was reproduced in the early "ledger" system, which had no concept of a customer as a unique individual. Different accounts opened by the same person at different times were unconnected (maybe you had moved house, married or just decided to use a middle initial instead of a middle name on the account). These deficiencies were partly corrected by imposing higher level layers on top of the original software, and further layers were added when web interfaces were required. Over the years anomalies build up, and they come to light as inconsistencies preventing account access when one attempts to move millions of items of information to a new bank system automatically.
Data only becomes information when there is a semantic interpretation ("This string of characters is the name of the student"). The catch is that all the data has to have a consistent interpretation before it can be considered information. (So, problems could potentially arise if students' names were stored in more than one place, perhaps in one place with a middle initial not present in the other place. Two items of data are supposed to have the same interpretation, but actually they are different strings.)
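A heavily simplified sketch of how part of such an analysis might carry over into code is given below, in Python. All class, attribute and method names (`Student`, `Module`, `Registry`, `record_mark`, `marks_for`) are hypothetical, and degree paths, presentations, TMAs and tutors are left out to keep it short. The point is only that entities, relationships, permitted actions and queries each get an explicit place.

```python
from dataclasses import dataclass, field


@dataclass
class Student:
    student_id: str  # unique identifier that never changes
    name: str
    address: str


@dataclass
class Module:
    module_id: str  # unique identifier
    name: str


@dataclass
class Registry:
    """Holds the entities and the relationships between them."""
    students: dict = field(default_factory=dict)  # student_id -> Student
    modules: dict = field(default_factory=dict)   # module_id -> Module
    marks: dict = field(default_factory=dict)     # (student_id, module_id) -> mark

    def add_student(self, student):
        self.students[student.student_id] = student

    def add_module(self, module):
        self.modules[module.module_id] = module

    def record_mark(self, student_id, module_id, mark):
        """Action: refuse any change that would leave the stored data inconsistent."""
        if student_id not in self.students:
            raise KeyError(f"unknown student {student_id}")
        if module_id not in self.modules:
            raise KeyError(f"unknown module {module_id}")
        self.marks[(student_id, module_id)] = mark

    def marks_for(self, student_id):
        """Query: all marks for one student, as {module name: mark}."""
        return {
            self.modules[m_id].name: mark
            for (s_id, m_id), mark in self.marks.items()
            if s_id == student_id
        }


registry = Registry()
registry.add_student(Student("S001", "Jane Smith", "1 High Street"))
registry.add_module(Module("M101", "Databases"))
registry.record_mark("S001", "M101", 78)
print(registry.marks_for("S001"))
```

Because a mark is stored against identifiers rather than against names, a student changing their name or address does not disturb any relationship.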
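The middle-initial problem can be sketched in a few lines of Python. The two dictionaries stand in for two separate stores of the same data, and every name here (`registration_names`, `exam_names`, `find_name_mismatches`) is an assumption for illustration only.

```python
def find_name_mismatches(registration_names, exam_names):
    """Return {student_id: (name in first store, name in second store)}
    for every student whose name differs between the two stores."""
    mismatches = {}
    for student_id, reg_name in registration_names.items():
        exam_name = exam_names.get(student_id)
        if exam_name is not None and exam_name != reg_name:
            mismatches[student_id] = (reg_name, exam_name)
    return mismatches


# Hypothetical data: the same students recorded in two places.
registration_names = {"S001": "Jane A. Smith", "S002": "Tom Jones"}
exam_names = {"S001": "Jane Smith", "S002": "Tom Jones"}

print(find_name_mismatches(registration_names, exam_names))
```

A check like this only detects the inconsistency after the event; storing the name once, keyed by the unchanging identifier, would prevent it from arising.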
Software Design Patterns: experienced programmers organise their system designs according to well-known design patterns (e.g. as described here). A relatively small number of such patterns can be combined in various ways to assemble a design for almost any software. We know these patterns work (just as we know that certain layouts of mechanics within cars work). We also know that other engineers are familiar with the same patterns so they are already way up the understanding curve. This is like taking the well worn motorway routes from A to B rather than the shortest route across country. You are less likely to get lost. A good professional programmer will have used a number of standard design patterns and have enough familiarity with the patterns literature to know when other patterns are available to fit the current programming problem.
Wider Software Construction Skills
When I worked in the nuclear industry I had some responsibility for the experience development of junior staff who needed to produce software as part of their work. (Note that for most of them their primary role was focussed on engineering or physics: the software construction was an essential part of what they did, but was not their fundamental professional motivation.) All of them had to go through a mentoring process in which they had to show that they had developed a number of skills. These included:
- A good knowledge of at least one programming language.
- An understanding of configuration control.
- An understanding of software life cycles and quality assurance.
- A demonstrated ability to construct at least one system of a complexity that actually requires the exercise of the full range of analysis, design and construction skills, including performing and documenting:
  - Analysing the requirements for the new system.
  - Specifying the desired behaviour of the system.
  - Implementing the specification (including programming).
  - Testing.
  - Delivering a product of known configuration to the client.
- An understanding of the limitations of their knowledge and skills.
If you followed a degree course in Computer Science or Software Engineering you would cover very considerably more than this in your skills development. We were aiming to give physicists and engineers (who have very high level skills in their own field) just enough software engineering to do what they needed to do without making serious mistakes. It is important to know one's limitations. Here, we were constructing systems where the fundamental complexities were in the physics, maths and engineering aspects, not the software structures. Broader skills and more experience would be required when the software complexity comes to the fore.
Note that "programming" as such appears only in one part of this list. It is, of course, still an essential link in the chain without which all the rest falls apart. But it is only one link and all the other links are equally critical to producing high quality software.
Completing this experience development programme provided a licence to take responsibility for their own work (until then the mentor was held at least partly responsible for errors): a basic driving licence - but not by any means a qualification to take on complex software design work. They were, however, in a position to take charge of their own further experience development.
Let's be honest: most of us in software construction really enjoy the programming bit. Unfortunately, that often leads less experienced developers to concentrate on this phase of construction, skimping the analysis/design phases, and getting away with as little testing and documentation as possible. 40 years of experience tells me that this is certainly a mistake. I know, from hard won experience (making all the mistakes!) that the fastest way to build good software is to spend more time thinking and less time programming. Less thinking means more errors that need to be corrected with rework, and probably the whole system gets to look like a pile of sticking plaster. It is not possible to build good software by trial and error: quality has to be built in from the start (from the moment someone says "I think we need a system to do something like this.")
(N.B. There are software development methodologies under the "Agile" or "Extreme Programming" headings that seem to the naive to justify this type of approach. It is not true. Agile methods are a disciplined approach that can be highly effective in some environments. They are not a free for all. Furthermore they are not suitable for all application areas - which I regret. I spent my career on systems which have relatively simple user interfaces, but highly complex and many layered algorithms. You have to have a very good idea where you are going before you start, and you have nothing to show for your efforts until just about everything is fully implemented and working correctly.)
How long does it take to become an expert developer?
That rather depends on the scope over which you intend to operate. If you confine your ambitions to an area such as small scale web applications development, one could become productive at a professional level with perhaps a year of concentrated work. If you are thinking you might work on safety critical systems, such as fly-by-wire aircraft control, then there are a great many difficult skills to master, sometimes, for example, including producing mathematical proofs of design correctness. Not everyone would have the pre-requisite knowledge and skills to follow this track. For people whose primary focus was a different technical skill (e.g. nuclear engineering) it typically took about two years for someone to complete an experience development programme in software development.
In my own experience, the technology changes with a timescale of about five years (that is to say, the basic software production methods must be completely relearned every five years or so). This feels about right for acquiring an adequate range of cutting edge hands-on system development skills.
The higher level skills are more transportable and with more experience you see how they can be applied in more and more contexts and on bigger scales. I thought I was an expert after 10 years of active work on building computational physics systems (some quite large scale). I learned how little I knew about fundamental theoretical underpinnings and broader applications when I undertook an Open University Masters degree in Computing. I still think that I was becoming better at what I did until the day that I retired (after providing the technical lead for three years on a novel project involving a lot of parallel computation).