I know there is a thread about what every developer should know but this is really a list of topics, such as xml, software licensing, various important distinctions (e.g. value/reference types), and many general topics (xml etc being important as they are used in code to construct xml documents etc).
Anyway, as I am working on some of those topics above, what algorithms should every dev know? I ask because many job interviews involve a short practical test which involves implementing a common algorithm (like a linked list, ok this isn't an algorithm but something every dev should know how to code). I already have a class library with some basic, common algorithms but I want to expand on it to include complex and rare algorithms not really well known in computer science.
It would be very welcomed if you could provide algorithms asked for in tests when you applied for jobs at dev shops or large companies like Microsoft/Google.
I hear a lot about A*, Dijkstra and the travelling salesman problem so I will learn about these. I understand the 8 basic sort algorithms (although probably can't remember all the subtle differences). Also, to compliment this activity, I am reading data structures and algorithms with C# (my main language I use to write code).
Any suggestions guys?
Besides knowing particular algorithms, it might also be useful to practice general algorithmic techniques like:
A nice place to apply these to different new problems are the TopCoder algorithm competitions [6] and practice rooms [7].
[1] http://en.wikipedia.org/wiki/Proof_by_exhaustionAt a bare minimum, every developer should know how to:
None of these are particular difficult for a competent developer, even if you've never implemented them before.
And while its not a requirement, I think its important for developers to know:
There are three categories of algorithms that you'll get a lot of use out of.
If you understand several of each of these you'll have the basics down. This might not be a popular idea, but I think that these are the only three that everyone must know. Anything beyond these three is going to depend on what area you're interested in. Game programmers are going to learn a lot of algorithms in graph theory, but a Web developer might not find much use for them, for one example.
I think, the ability to come up with different ways to solve a particular problem (i.e. building your own algoirithms, no matter how simple/trivial it may be) is a more important skill to have (and harder to develop).
Remember, you can't always know all the possible algorithms that may be asked in an interview. But, what you will be tested on is how efficiently you can think on your feet when given a complex problem. However, knowing and understanding a lot of algorithms can help in developing this skill.
Two I've used for interview purposes:
Yes, I realize these are very simple and we could all knock them off very quickly, but I'm not testing for the end product and I don't want them sitting down for two hours trying to nut out something complicated.
I want to see the process followed to arrive at a result (so I get them to white-board the entire thing, explaining as they go). It's more to see if they have problem solving ability. That's far more important than knowing the algorithms off the top of their head.
I really don't care if they can code up a bubble sort (it's a crappy sort algorithm for large data sets anyway and, if you need some sort code, you'll be using something built in to the language, pulled out of Wikipedia or found with Google, not writing your own - I'm not going to be happy paying you to re-invent the wheel).
The problem with asking someone to come up with a complicated algorithm is that they won't be doing that on the job. Most code is very simple stuff so you may as well just ask them to code up a for-loop. By all means ask them about algorithms and see what they know. For example, ask them how they would sort an array. I wouldn't hire anyone who coded up a bubble sort (or even a quick sort, for that matter). I would consider hiring them if they knew that different sort algorithms have different properties depending on the data set.
The best response in my view to a question like "Can you implement an array in Perl which uses a string as an index?" is not "Yes". It's either "Perl already has those, they're called associative arrays" or "No, but I know how to find code that's freely available to use on the net". Of course, if they come up with an O(n) or O(1) sort algorithm, I'll hire them on the spot :-).
As an aside, one guy answered the Tic Tac Toe question by transferring the board to the center of a 7-by-7 matrix (with the outer squares set to 0), then looping on the inner squares, checking for three of the same value in every direction to "simplify the algorithm" as he put it (not needing to worry about crossing over the boundaries of the array). Something like (and I kid you not, the amount of comments was the same as shown here):
int board[7][7];
for (col = 2; col < 5; col++) {
for (row = 2; row < 5; row++) {
board[row][col] = 0
}
}
board[2][2] = arr[0];
board[2][3] = arr[1];
board[2][4] = arr[2];
board[3][2] = arr[3];
board[3][3] = arr[4];
board[3][4] = arr[5];
board[4][2] = arr[6];
board[4][3] = arr[7];
board[4][4] = arr[8];
for (col = 2; col < 5; col++) {
for (row = 2; row < 5; row++) {
for (coldir = -1; coldir < 2; coldir++) {
for (rowdir = -1; rowdir < 2; rowdir++) {
if ((coldir != 0) || (rowdir != 0)) {
sum = 0;
for (i = 0; i < 3; i++) {
sum = sum + board[row+i*rowdir][col+i*coldir];
}
if (sum == -3) return -1;
if (sum == 3) return 1;
}
}
}
}
}
return 0;
It was an innovative approach but way too complicated compared to the 8 if-statements that most people came up with. Still I gave him a chance to explain his reasoning and, in his "Thanks but no thanks" letter, I let him know why he didn't get the job (this code was just one of the reasons) so hopefully, he's improved himself and is now gainfully employed.
And, before anyone asks, the 8 if statements are:
if ((arr[0] != 0) && (arr[0] == arr[1]) && (arr[0] == arr[2])) return arr[0];
if ((arr[3] != 0) && (arr[3] == arr[4]) && (arr[3] == arr[5])) return arr[3];
if ((arr[6] != 0) && (arr[6] == arr[7]) && (arr[6] == arr[8])) return arr[6];
if ((arr[0] != 0) && (arr[0] == arr[3]) && (arr[0] == arr[6])) return arr[0];
if ((arr[1] != 0) && (arr[1] == arr[4]) && (arr[1] == arr[7])) return arr[1];
if ((arr[2] != 0) && (arr[2] == arr[5]) && (arr[2] == arr[8])) return arr[2];
if ((arr[0] != 0) && (arr[0] == arr[4]) && (arr[0] == arr[8])) return arr[0];
if ((arr[6] != 0) && (arr[6] == arr[4]) && (arr[6] == arr[2])) return arr[6];
return 0;
But you knew that already, I'm sure.
To generalize it, we can also use this pseudo-code:
const SIZE=3
// cells_array is a sized SIZE*SIZE array
function cell(x,y)
return(cells_array(x+y*SIZE))
function get_winner(z)
result=0
if abs(z)=SIZE then result=sign(z)
return(result)
for i = 0 to SIZE-1
sumrow=0; sumcol=0; sumd1=0; sumd2=0
for j = 0 to SIZE-1
sumrow=sumrow+cell(i,j)
sumcol=sumcol+cell(j,i)
if (i==0) then // We only need diagonals calculated once!
sumd1=sumd1+cell(j+j)
sumd2=sumd2+cell(j+SIZE-j-1)
winner=0
if (winner==0) then winner=get_winner(sumrow)
if (winner==0) then winner=get_winner(sumcol)
if (winner==0) && (i==0) then winner=get_winner(sumd1)
if (winner==0) && (i==0) then winner=get_winner(sumd2)
for x in [[0,1,2],[3,4,5],[6,7,8],[0,3,6]...]: if b[x[0]] != 0 && b[x[0]] == b[x[1]] && b[x[1]] == b[x[2]]: return b[x[0]]; return 0
is much easier to write, read and modify. If you make a single mistake in a number in 8 conditional statements it means you have to read the whole thing again. - sdcvvc
Being able to implement basic search algorithms from CS101 like quicksort and the like isn't really all that important, every language has flexible, built-in implementations these days, and I can't remember the last time I had to call one directly. If I need something sorted, I'll store it in a sorted set.
The problem here is that you're not asking what algorithms you should know to make you a better developer you asking what algorithms you should know to make you a better interviewee. And there's no way to know that, because every interviewer will be different, and you can always have bad developers who manage to make it into management. Frankly, I think interviewers who ask CS101 style trivia questions are probably not that bright themselves.
Now, if you really want to be a better programmer, I think the best thing to do is have a broad and deep understanding of what the available libraries can do, first of all. So if I was looking for a java developer I'd want someone who could answer questions about the collections API and how to solve some simple problems by combining them in ways that let them look up information quickly.
I guess it depends on what you are looking to do. Very few developers actually get to implement the algorithms you use in your example in their daily lives. In truth, they're solved problems.
If you're one of the lucky few actually building fundamental libraries rather than the typical enterprise glue-code, then I would argue that an understanding of the approaches and patterns for algorithms would be a better topic of interest. Stuff like divide and conquer, dynamic programming, greedy, backtracking, genetic.
If you ever get to do something like that, you'll be in a very elite group.
you should know of the existence of several algorithms - hashing functions, hash tables, quicksort, merge-sort, depth-first and breadth-first graph traversal, neural networks, genetic algorithm, genetic programming, recursive descent parsing, etc.
you should reinvent none of them.
The obvious two:
You really should not forget the ability to analyze the algorithms you learned or designed yourself, preferably by using standard big O notation (try to master big Theta and friends as well).
For instance, you could be asked to compare the eight sort functions you learnt. Or to compare a well-known algorithm to something esoteric (which is most probably the one you just developed as an answer to a question that you were presented for the first time). If efficiency is of any essence (instead of, e.g., readability), then you should be able to analyze the algorithmic complexity of each option, and consequently decide on the asymptotically best choice. (In addition to that, you could of course mention that for a small N, something else would work faster, if that's the case.)
I think it's vital for a developer to recognize a bad algorithm.
And for that I really think you need to be familiar with
"Big O"
[1] notation.
I suggest asking them to code some very simple algorithm. Even a very simple algorithm weeds out the incompetent and allow the talented to shine. I've read about this on Joel On Software and actually when I tried it out in real life it worked excellently.
Ask them to code a bubble sort algorithm. Bubble sort is really simple, everyone should be able to do it in a couple of minutes, right? Well, apparently not. Only the fair/good ones and the very good ones will be able to do it, and even though it is just a few lines of code you will see that you will be able to just tell the difference in the skills of the applicants.
This way, you will save a lot of time that can be spent on something more useful, like selling your company to the best applicants.
EDIT: To put it other way, from the perspective of a job seeker, I would just concentrate on the most basic stuff you have learned at university on the Data Structures class, like sorting, handling binary trees, etc. Anything else is not really relevant on a job interview unless of course you are applying for a position in a very specific domain (e.g. if your job would be writing compilers, it doesn't hurt to know a little bit about lexers)
quoting from: Francis Sullivan, "The Joy of Algorithms," Computing in Science and Engineering, vol. 2, no. 1, pp. 2, Jan./Feb. 2000, doi:10.1109/MCSE.2000.10000, these are the "10 algorithms of the century" (this was written in 2000). Does not exactly answer your question, but certainly has some relevance.
the Monte Carlo method or Metropolis algorithm, devised by John von Neumann, Stanislaw Ulam, and Nicholas Metropolis;
the simplex method of linear programming, developed by George Dantzig;
the Krylov Subspace Iteration method, developed by Magnus Hestenes, Eduard Stiefel, and Cornelius Lanczos;
the Householder matrix decomposition, developed by Alston Householder;
the Fortran compiler, developed by a team lead by John Backus;
the QR algorithm for eigenvalue calculation, developed by J Francis;
the Quicksort algorithm, developed by Anthony Hoare;
the Fast Fourier Transform, developed by James Cooley and John Tukey;
the Integer Relation Detection Algorithm, developed by Helaman Ferguson and Rodney Forcade; (given N real values XI, is there a nontrivial set of integer coefficients AI so that sum ( 1 <= I <= N ) AI * XI = 0
the fast Multipole algorithm, developed by Leslie Greengard and Vladimir Rokhlin; (to calculate gravitational forces in an N-body problem normally requires N^2 calculations. The fast multipole method uses order N calculations, by approximating the effects of groups of distant particles using multipole expansions)
Steve Yegge [1] outlines several important programming-related things to know, including several algorithms and data structures. (Scroll down to "Tech Prep Tips" on the blog post.)
[1] http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.htmlI don't think it's necessarily that important for you to know any algorithms at all. I think it's more important to understand how to analyze algorithms and choose the best one for the given situation. After all, if programming were just about memorizing algorithms and implementing them, any idiot could do it.
Lexing and Parsing techniques have been invaluable to me. I realize these aren't algorithms, but knowing how to do this well is extremely useful.
I don't want to reiterate what has and will be said - sorting, trees, hash maps, etc.
Instead, I will give some practical advise. Know what sort of algorithms you will be using, and learn those in great detail.
For example, most languages will provide a sort algorithm, etc, but it is important that the developer understands how it is implemented within the language and when it is a good/bad time to use it.
Interviews I've had at Google tended to focus less on sophisticated algorithms and more on my ability to code simple algorithms cleanly, quickly, and well. I remember being asked such problems as reverse a string, reverse the words of a string (i.e. print the normal words but in the opposite order), and binary search. For all of these it really helps to know your loop invariants as described in David Gries's book The Science of Programming [1]. It's also very good to have a deep understanding of two or three recursive algorithms; it doesn't matter which ones. But you want to be able to recognize when recursion is called for.
The one hard algorithm I got asked was about an algorithm for the stable marriage problem, but we got a bit bollixed up because my interviewer knew the answer he was looking for, but he couldn't quite remember a precise statement of the problem.
If you have it in mind to interview, I suggest you spend at least as much time having a friend ask you to code algorithms at the board as you do learning new algorithms.
[1] http://rads.stackoverflow.com/amzn/click/0387964800There may be a large difference between what a developer should know and what they need to know. For example, most of us can get away knowing a very few data structures and the simplest (even non-optimal) algorithms that operate on those structures. Designers, framework developers, real-time application, OS, AI and game developers, to name just a few, may need to know substantially more. You question may really be aimed at the academic: what are the foundational algorithms of our trade? For the answer to this question, there's no better source than Donald Knuth's "The Art of Computer Programming" [1].
[1] http://www-cs-faculty.stanford.edu/~uno/taocp.htmlI would suggest studying some Graph Algorithms (shortest path, etc.) and Dynamic Programming Algorithms (closley related).
These are the type of algorithms you usually have to adapt to the specific case (optimization/search problems usually) rather than just call a library function.
There are many problems that are more easily solved if approached using graph theory.
While knowing algorithms is helpful it might also be helpful to have a surf around for the kinds of questions people ask in interviews. This way you can know what an employer will ask and, more importantly, the knowledge they want to see in you. Here are some i found from a quick search.
http://blogs.microsoft.co.il/blogs/sasha/archive/2008/04/02/tales-from-an-interview.aspx http://ayende.com/Blog/archive/2007/08/22/Interviewing-developers.aspx [1] http://www.codinghorror.com/blog/archives/000781.html Note the mention of the Fizz Buzz algorithm. I know of several people who have been asked that in an interview.
And of course a couple of SO answers
http://stackoverflow.com/questions/329289/really-wow-them-in-the-interview http://stackoverflow.com/questions/70763/good-c-interview-questions-for-a-senior-dev-position
Regards
Simon
[1] http://www.codinghorror.com/blog/archives/000781.htmlWorking through the various compression algorithms (deflate, LZW, BZip2, huffman, etc.) will force the learning of most of the standard algorithms.
Some of the good questions can also be found at http://coders-stop.blogspot.com
Work through Project Euler, completing as far as you can. Create your own primality tests, factoring algorithms, etc. with the help of Google/Wiki/Mathworld/Library for the math but coded yourself. The farther you get, the better you'll be. I'd take someone great at analyzing algorithm complexity over someone well-versed in all sorts of specific algorithms, but if you made it even 1/3 of the way through the PE problem set on your own you could walk into most positions - at least the interesting ones where the hunters recognize the difficulty of what you accomplished.
Edit: If you google something for a problem there, and you see it says "Project Euler" on the search results, just skip that link. Cheating won't get you anywhere - you aren't graded, and it's primarily for teaching yourself to seek and solve new, challenging problems.
Among others also State Machine.
I'm awed by the .NET framework. Not being a fanboy... python & ruby are very impressive as well. The thing is, since I started working with .NET, I've never reached for a programming construct that wasn't there. These algortithms are solved problems, and codified and formalized, and handed to you on a silver platter. Enjoy them.
What this all boils down to is that every software endeavor has some essential complexity (essential to the problem at hand), and a set non-essential problems (technical implementation details). Over time, tooling will reduce the non-essential complexity that we have to deal with. Ask any .NET developer who ever wrote against the standard c library.
Software doesn't exist for software's sake outside of universities, and so as professional software developers, we should spend most of our time worrying about the essential problems our software is trying to solve: should we be looking at objects with these criteria, or those criteria, not the non-essential software oriented problems of how do we find objects according to given criteria.
Now that said, a wood carver needs a wide array of knives and chisels. He needs to know how and when to apply each one... TO ACHIEVE A DESIRED EFFECT, but I'm not going to hire a woodcarver just because he has a big toolbox.
"Tortoise and hare" cycle-finding algorithm [1]
[1] http://en.wikipedia.org/wiki/Floyd%27s_cycle-finding_algorithm#Tortoise_and_hareHow about a function to recursively search a directory structure for a file. Extra points:
I wouldn't expect and applicant to code this completely in an interview situation (I.e. I wouldn't expect the applicant to know the OS API functions to get directory items, or write a regular expression parser), but at least be able to give a rough pseudo code and explanation of what is involved and/or should be considered.
I was asked to write a trivial function in an interview for MicroSoft years ago, I was a little put off. I don't write code, I think about it and then I compose it, and I never compose code with somebody sitting there waiting for me to do something in a matter of minutes and certainly not with the foreign medium of pen and paper :-)
So, I would not ask an applicant to write code at the interview.
I would rather know what kinds of things interest him/her. What has he/she programmed at previous employment, school projects, or as hobby and what they learned from it.
I would ask an applicant to bring samples of code they've done
(sources and executables, or perhaps assign the above problem before-hand)
and explain it to you, describe the techniques that were used, and how it could be enhanced. This way you can see if there is a good understanding of techniques,
coding style and communication skills
(extra points if (s)he happens to bring documentation/manual :-).
I would like to give them the opportunity to demonstrate their problem solving skill and their passion for programming.
The circular/ring buffer for simple producer-consumer multi-threading. The Google-query (( circular ring buffer C++ )) fetches many hits, but once you get the idea of the in-and-out subscripts you just refry this basic concept to fit the particular requirement. For low-level situations it does of course help to use power-of-two sizing so that the & operator can do the modulo-buffersize operations.
Good suggestions. Big O notation is something I already know, but also a valid suggestion.
Keep em coming! Great stuff.
BTW, this question is not just to make me a better interviewee but ultimately a better developer, which is my long term goal. This question came up as I applied for a job and I was expected to know the following algorithms (it was a standard .NET developer on the web app side - ASP.NET):
Markov Chain Monte Carlo Stochastic Processes Statistical Analysis Optimisation Algorithms Discrete Mathematics Algebraic Structures Combinatorics
This was a graduate position. I guess the company wanted someone who knew all of this because of the nature of their projects?
Thanks Guys!
The Hodge Conjecture would be a good place to start.
Hmm
I'd specialise on what Drew suggested
-- Binary Search of a good library, on-line or paper
-- Quick Sort of what you find
After 30 years programming, I've forgotten most of the stuff that I ever knew, and place ever greater reliance on being able to find a good algorithm quickly. On the plus side, I've built up a library of code which tackles most of the basic problems, and many of the not so basic ones, in my domain.
What algorithms should any developer know? As many as possible!
Given that, you should just learn new algorithms and methods for solving problems in whatever area you are working on at the moment. If you just learn algorithms for the sake of learning algorithms, you won't remember or understand them well. It's always best to combine theory and practical application for the most effective learning.
Besides knowing algorithms -- you should know how to invent new ones! Someone had to come up with the algorithm originally, so it is possible to always come up with an algorithm for a problem (if it is solvable). Maybe one day it will be your algorithms that others are learning to get a job at Google or Yahoo or whatever replaces it.