In addition to "I never met a man I didn't like", Will Rogers had another great little ditty I've always remembered. It went:
"It's not what you don't know that'll hurt you, it's what you do know that ain't so."
We all know or subscribe to many IT "truisms" that mostly have a strong basis in fact: something from our professional careers, something we learned from others, or lessons learned the hard way, by ourselves or by those who came before us.
Unfortunately, as these truisms spread throughout the community, the details — why they came about and the caveats that affect when they apply — tend to not spread along with them.
We all have a tendency to look for, and latch on to, small "rules" or principles that let us avoid doing an exhaustive analysis for every decision. And even though these rules are correct much of the time, when we misapply them we pay a penalty that understanding the details behind them would have avoided.
For example, when user-defined functions were first introduced in SQL Server, it became "common knowledge" within a year or so that they had extremely bad performance (because each use required a recompilation) and should be avoided. This "truism" still feeds many database developers' aversion to UDFs, even though Microsoft's later introduction of inline UDFs, which do not suffer from this issue at all, has substantially mitigated it. In recent years I have run into numerous DBAs who still believe you should "never" use UDFs because of this.
What other common not-so-"truisms" do you know that many developers believe, that are not as universally true as commonly understood, and that the developer community would benefit from being better educated about?
Please include why it was "true" to start with, and under what circumstances it isn't. Limit responses to technical issues where the "common" application of a "rule or principle" is in fact correct most of the time, or was correct back when it was first elucidated, but can easily backfire or cause the opposite of the intended effect in the edge cases: because the principle isn't thoroughly understood, because technology has changed since it first spread, or because the rule is applied today without knowledge of the details behind it.
You need to know all of your requirements ahead of time because it's too expensive to change things later in development.
In reality, no one ever knows all of their requirements ahead of time, and you can develop code in such a way as to mitigate the inevitable changes and new requirements. This might not be as much of a truism as it used to be, now that Agile development methods have gained currency.
Java is slow
Lines of Code is a good way to track productivity of your developers and overall project health.
Never hard code any value.
#define FOURTY_TWO (42)
- LiraNuna
So the 0 in
if (results.Count() > 0) ...
really bothers you so much that you'd use a macro or a variable to hold the value? - tvanfosson
#define MEANING_OF_LIFE_UNIVERSE_AND_EVERYTHING (42)
is a valid use case? - Jonathan Day
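The two comments capture the real rule: a literal deserves a name when it encodes a decision that might change or whose meaning isn't obvious at the call site, not when the literal is the meaning. A quick sketch in Python (the constant and helpers are invented for illustration):

    MAX_LOGIN_ATTEMPTS = 5          # hypothetical policy value; worth naming
    
    def should_lock(attempts):
        # The name documents *why* 5, and gives one place to change it.
        return attempts > MAX_LOGIN_ATTEMPTS
    
    def has_results(results):
        # Here 0 *is* the meaning; a ZERO constant would only add noise.
        return len(results) > 0
    
    print(should_lock(6))    # True
    print(has_results([]))   # False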
Programmers at the same level are completely interchangeable
How about: unit testing doubles development time.
You don't need to worry about security until later on in the project.
Documentation can be written after the software has been deployed. (We'll have time to do it then)
One Entry One Exit
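The rule made sense in the structured-programming era, when subroutines could have multiple entry points and all cleanup had to funnel through one place; applied dogmatically in a modern language it mostly adds nesting. A toy sketch in Python (the shipping rules are invented):

    def ship_cost_single_exit(total, express):
        # Dogmatic single-exit style: one return forces nesting and a
        # mutable result variable.
        if total >= 100:
            cost = 0.0
        else:
            if express:
                cost = 15.0
            else:
                cost = 5.0
        return cost
    
    def ship_cost(total, express):
        # Guard-clause style: each case is handled and dismissed, and the
        # logic reads straight down.
        if total >= 100:
            return 0.0
        if express:
            return 15.0
        return 5.0
    
    assert ship_cost(120, False) == ship_cost_single_exit(120, False) == 0.0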
Your user interface doesn't matter so long as the code works.
C++ is slower than C
Everything should be done in stored procedures
or conversely
Never use stored procedures
There is one True way of programming that's suitable for everything, and any other way is always wrong. Mostly seen among OO or functional fanatics.
Big-O Notation: O(1) < O(n)
We all make this mistake -- especially me :)
I can't find the post, but I remember reading a microcontroller blogger who described a case where his hardware needed to store some key/value pairs. Performance was critical and a hashtable with constant time lookup seemed to make sense; if I remember correctly, this setup performed quite well for years.
Out of curiosity, the programmer swapped the hashtable for an unsorted linked list, which easily beat the hash table for dictionaries of fewer than 20 items. Later, a sorted array with binary search, with O(lg n) lookup, absolutely demolished the hash table below 500 key/value pairs, although it was slightly slower than the linked list below 10 items.
Since the original hardware never stored more than 15-30 keys at any given time, a sorted array replaced the hash table and our blogger became dev team hero for a day.
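The experiment is easy to rerun yourself. A rough sketch in Python (CPython's dict is far more optimized than an embedded hash table, so the constants will differ, but the shape of the comparison is the same):

    import bisect, timeit
    
    n = 20                                       # small table, like the microcontroller case
    keys = ["key%03d" % i for i in range(n)]
    probe = keys[n // 2]
    
    table = {k: i for i, k in enumerate(keys)}   # hash table
    pairs = list(table.items())                  # unsorted list: linear scan
    skeys = sorted(keys)                         # sorted array: binary search
    
    def hash_lookup():
        return table[probe]
    
    def linear_lookup():
        for k, v in pairs:
            if k == probe:
                return v
    
    def binary_lookup():
        return bisect.bisect_left(skeys, probe)
    
    for fn in (hash_lookup, linear_lookup, binary_lookup):
        print(fn.__name__, timeit.timeit(fn, number=100_000))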
Our project is going to miss its deadline! ... Quick, let's throw more people onto the project! (i.e. The Mythical Man-Month)
Reference types live on the heap, value types on the stack
"SQL in code is bad! Get the SQL out, and then we're good on data access." This simplistic thinking contains some truth but causes a lot of problems. Good data access strategy is sooooo important.
Never, ever use a goto because they're harmful.
This was originally cited as "true" because it was noticed that code with lots of gotos was poor in quality.
This is an example of attacking the misused tool (anyone for try/catch?) instead of the real problem, which is failing to recognize and prevent unmaintainable, poor-quality code.
The one that irks me the most: Published "best practices" work for everyone.
Malarkey.
Every company is different. The staff is different, the business model is different, the clients are different, the fiscal outlook is different, the culture is different, the politics are different, the technology is different, the long and short term goals are different, and on and on and on.
What works for one company will not necessarily work for another company. And I cannot repeat this enough: There is no silver bullet. Just because some guy (or some group of guys) wrote it in a book and slapped a fancy title on it does not make it irrefutable, beyond reproach, or an iron-clad guarantee that it will work in your situation.
You should carefully review any given "best practice" (or mediocre practice, for that matter) for its suitability for what you're doing, where you are, and where you're going before you even think about putting it in place.
Two words, folks: Risk analysis.
Microsoft IIS is insecure / Apache is secure
You hear this one a lot too, but the criticisms of MS/IIS security are about 10 years outdated. Compare the vulnerability reports on Secunia [1] for Apache and for Microsoft IIS.
To look at it another way, there is a well-known article from Mar 2008 [9] which summarizes some findings by Netcraft and Zone-H. Although there are 1.66x as many Apache sites as IIS sites, Apache sites are defaced 2.32x as often, so the per-site defacement rate for Apache is about 1.4x that of IIS. The Slashdot reaction to this article [10] is worth reading.
[1] http://secunia.com/

Use a simple editor or IDE and you will be productive at once.
Not spending your time learning hotkeys, regex-based editing, and the other power features of a professional tool may save you a few days up front, but it will cost you hundreds of them later.
The more design patterns you use the better.
Applying design patterns can make code better, and it's great to have a shared vocabulary for developers. However, many solutions don't require patterns, and knowledge of patterns is no substitute for understanding algorithms, data structures, and the fundamentals of problem solving.
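For instance, in a language with first-class functions a full Strategy hierarchy is often pure ceremony. A hypothetical discount example in Python:

    # Pattern-heavy version: a Strategy class hierarchy.
    class DiscountStrategy:
        def apply(self, price):
            raise NotImplementedError
    
    class TenPercentOff(DiscountStrategy):
        def apply(self, price):
            return price * 0.9
    
    def checkout_with_strategy(price, strategy):
        return strategy.apply(price)
    
    # Same behavior with a plain function argument; no pattern required.
    def checkout(price, discount):
        return discount(price)
    
    print(checkout_with_strategy(100.0, TenPercentOff()))  # 90.0
    print(checkout(100.0, lambda p: p * 0.9))              # 90.0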
Reflection (in .NET; not sure about Java) is very expensive and therefore extremely slow, hence it should be avoided at all costs.
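The grain of truth: reflective access does cost more per call than direct access, but "more" is usually tiny in absolute terms, and the lookup can be cached. A Python analogy (not .NET, but the same principle) comparing direct attribute access with getattr:

    import timeit
    
    class Point:
        def __init__(self):
            self.x = 1
    
    p = Point()
    name = "x"
    
    direct    = timeit.timeit(lambda: p.x, number=1_000_000)
    reflected = timeit.timeit(lambda: getattr(p, name), number=1_000_000)
    print(direct, reflected)   # reflected is slower, but both are tiny per call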
SQL Server specific: Stored procedures perform better than dynamic SQL because they're precompiled.
I don't know how many times I've seen this one, but it's wrong.
See SQL Server 2000 documentation [1]:
SQL Server 2000 and SQL Server version 7.0 incorporate a number of changes to statement processing that extend many of the performance benefits of stored procedures to all SQL statements. SQL Server 2000 and SQL Server 7.0 do not save a partially compiled plan for stored procedures when they are created. A stored procedure is compiled at execution time, like any other Transact-SQL statement. SQL Server 2000 and SQL Server 7.0 retain execution plans for all SQL statements in the procedure cache, not just stored procedure execution plans.
See SQL Server 2005/2008 documentation [2]:
When any SQL statement is executed in SQL Server 2005, the relational engine first looks through the procedure cache to verify that an existing execution plan for the same SQL statement exists. SQL Server 2005 reuses any existing plan it finds, saving the overhead of recompiling the SQL statement. If no existing execution plan exists, SQL Server 2005 generates a new execution plan for the query.
SQL Server creates an execution plan for every SQL statement on its first invocation, then caches the plan in memory for future use. Apart from edge cases where transmitting huge SQL strings adds network latency, there is no performance benefit gained by using stored procedures over dynamic SQL.
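You can see the plan reuse from any client. A minimal sketch in Python using pyodbc (the connection string and the Customers table are hypothetical):

    import pyodbc  # assumes a SQL Server ODBC driver is installed
    
    conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                          "SERVER=.;DATABASE=Shop;Trusted_Connection=yes")
    cur = conn.cursor()
    
    # Parameterized dynamic SQL: SQL Server compiles the statement once and
    # reuses the cached plan on later executions with different values,
    # exactly as it would for an equivalent stored procedure.
    for city in ("London", "Paris", "Berlin"):
        cur.execute("SELECT CustomerID, Name FROM Customers WHERE City = ?", city)
        print(cur.fetchall())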
[1] http://msdn.microsoft.com/en-us/library/aa174792%28SQL.80%29.aspx

"Premature optimization is the root of all evil" - Knuth
In print it is very often used without the context of the full quote.
Additionally, neither of the two people said to have created it (Hoare is the other) claims to have originated it.
I typically associate the above quote with laziness and excuses when I hear or read it.
The full quote (whatever the origin):
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
The difference (made by the added qualification) is huge.
Pair programming means double the development cost!
"Pair programming. What researches say on the costs and benefits of the practice" [1] would be a source to counter that.
[1] http://agilesoftwaredevelopment.com/blog/artem/pair-programming-what-researches-say

Computers are really clever and will solve any problem we encounter.
From what I've seen over the years, there appear to be two distinct groups of people: those who think computers are really clever and those who think computers are really dumb. Unfortunately, most people believe the former when in fact computers are really dumb - they do exactly what we tell them to do, even if that is to start a global thermonuclear war.
Skizz
Performance-related falsisms:

That performance tuning is all about precise measurement: measuring is fine for monitoring program health, but pinpointing problems is not about measuring. It's about finding cycles that have poor reasons. This does not require running fast; it requires detailed insight into what the program is doing (typically via sampling as much of the program state as possible and understanding in detail why it's doing what it's doing at each sample time). A rough sketch of this sampling idea follows the list.

Typical performance problems worth pursuing take from 10% to 90% of execution time. (That is how much execution time is reduced after you fix them.) The object is to find the problem, not to know precisely how big it is. Even a small number of random-time samples is virtually guaranteed to display the problem, assuming they are taken during the overall time span when the performance problem exists.

That compiler optimization is what makes programs fast: it only matters in code that 1) you actually compile (as opposed to libraries), and 2) you actually spend much time in (as opposed to code that spends all its time calling other functions, explicitly or implicitly).
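Here is a crude version of that sampling idea in Python: a background thread grabs the main thread's innermost source line at random times, and the line that dominates the counts is the one worth looking at (sys._current_frames is CPython-specific, and the workload is invented):

    import collections, random, sys, threading, time, traceback
    
    def sample_main_thread(counts, duration=1.0):
        # Grab the main thread's innermost source line at random times
        # until the clock runs out.
        main_id = threading.main_thread().ident
        deadline = time.time() + duration
        while time.time() < deadline:
            frame = sys._current_frames().get(main_id)
            if frame is not None:
                counts[traceback.extract_stack(frame)[-1].line] += 1
            time.sleep(random.uniform(0.001, 0.01))
    
    def workload():
        total = 0
        for i in range(10 ** 7):
            total += i % 7        # the "hot line" the samples should expose
        return total
    
    counts = collections.Counter()
    sampler = threading.Thread(target=sample_main_thread, args=(counts,))
    sampler.start()
    workload()
    sampler.join()
    print(counts.most_common(3))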
Always use stored procedures.
Exponential-time algorithms are slower than polynomial-time algorithms.
In linear programming [1], the simplex algorithm has exponential worst-case running time, yet in practice it is typically much faster than its polynomial-time ellipsoid counterpart.
[1] http://en.wikipedia.org/wiki/Linear%5Fprogramming

Based on a paper from 1978, people quote that maintenance is 20% corrective, 20% adaptive, and 60% perfective. These percentages came from a survey of managers' opinions, with no empirical evidence behind them. In 2003, another group of researchers (Stephen R. Schach, Bo Jin, Liguo Yu, Gillian Z. Heller and Jeff Offutt) challenged this by studying maintenance data for Linux, RTP, and GCC, and found wildly different numbers. See their paper: Determining the Distribution of Maintenance Categories: Survey versus Measurement [1].
[1] http://cs.gmu.edu/~offutt/rsrch/abstracts/LST-maint03.html

From the premature-optimizations department:
Denormalize your schema up front because normalized schemas are too slow and full of joins to be usable in the Real World.
PHP isn't a language you should use for serious websites.
Staying late and working overtime is the only way to make deadlines.
...sure, until you are so bloody exhausted you can barely see straight and the excessive caffeine leads to the shakes or a mental kernel panic.
... to heck with better planning/doing actual estimates/setting more realistic expectations.
Low-level languages (Assembler, C) produce faster code than high-level languages (C++, Java, OCaml). Often when you show people benchmarks that prove the opposite, they even think there's some kind of "trick" involved, because "nothing can be faster than C except assembly, right?"
Design your application from the ground up: start with the database model.
Number Of Bugs per Line Of Code measures Quality (yep, not so true or relevant in the practical world as we know it today)
Business Development Guy: "If I can write the spec, then anybody can write the spec...so anybody can build my product"
Static typing and strong typing are the same thing.
There are plenty of languages that are strongly and dynamically typed out there; Python is a particularly popular example.
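A quick demonstration of the difference in Python:

    # Dynamic: a name can be rebound to values of different types at runtime.
    x = 42
    x = "forty-two"
    
    # Strong: values themselves keep strict types; no silent coercion.
    try:
        print("2" + 2)        # TypeError; a weakly typed language might yield "22"
    except TypeError as err:
        print(err)            # can only concatenate str (not "int") to str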
We can defer this bug as long as we document it in the release notes.
A more recent one:
Don't bother with that, hardware is cheap, we'll buy more servers.
Yeah, hardware is cheap. But when you buy a server, you pay a price every month for hosting and/or electricity and/or bandwidth, and you add an extra cost to your maintenance too: you spend more time on migrations and deployments.
Yes, hardware is cheap to buy, but unless you are a cloud-computing-virtualisation-sysadmin hero, owning a new computer carries a significant ongoing cost.