What is the longest name that you should expect to get as input to your program or database?
I don't want to truncate unusual names, but I also don't want people to paste a novel in my name field as this could result in security problems. Has anybody ever been bitten by setting this field size too short?
Guinness World Record for Longest Name in the world belongs to a Mr. Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Wolfeschlegelsteinhausenbergerdorffwelchevoralternwarengewissenschaftschafe rswessenschafewarenwohlgepflegeundsorgfaltigkeitbeschutzenvonangreifeudurch ihrraubgierigfeindewelchevoralternzwolftausendjahresvorandieerscheinenersch einenvanderersteerdemenschderraumschiffgebrauchlichtalsseinursprungvonkraft gestartseinlangefahrthinzwischensternaitigraumaufdersuchenachdiesternwelche gehabtbewohnbarplanetenkreisedrehensichundwohinderneurassevonverstandigmens chlichkeitkonntefortpflanzenundsicherfeuenanlebenslanglichfreudeundruhemitn icheinfurchtvorangreifenvonandererintelligentgeschopfsvonhinzwischenternart Zeus igraum Senior, who was born in Munich in 1904 and lived in Philadelphia for most of his life. Apparently he shortened his name to Wolfeschlegelsteinhausenbergerdorff, and subsequently went by Hubert Blaine Wolfe, but the "Senior" indicates that he passed some form of his name to his son.
If you do not support someone's name, you may cause them hardship. (rdentato) This may lead to tort cases, i.e. getting sued.
Guinness World Record for Longest Name is 802 characters. (Dynite)
You must support Unicode characters in the name. (rdentato)
You must support punctuation, such as hyphens, in all parts of the name. (rdentato) (RichH)
Accept one word names. (RichH)
Accept one letter names. (e k)
The structure, layout and number of names are dependent on the culture, e.g. "Latin Americans and Spaniards use 2 surnames" (japinedaf)
Most countries have different conventions from the USA. First Name, Initial, Last Name is inadequate even in the USA.
Usually American names have 3 parts, Latin names have 4 parts, Arabic names have 7 parts. (Mike Post)
Don't limit titles/appellations. Mr, Mrs & Ms are insufficient. People can have multiple titles, e.g. Rev. Dr. (Matt Lacey)
Consider using varchar(max) and let the database deal with it. (TheOtherScott)
Consider any interactions with legacy systems. For example, SSIS uses nvarchar(255) when importing data from Excel.
Use a single large free form field for the complete proper name. Use another free form field for the 'preferred' name. (mat_geek) Implement a sophisticated modern search if the records may need to be located by name. (mat_geek)
And as always, sanitise your input for SQL Injection attacks. (Chris Ballard) See: XKCD [1] (jeff)
[1] http://xkcd.com/327/
This is a great question, because it's the kind of annoying but important decision developers face every day.
I recommend picking a reasonable limit and sticking with it. People with bizarrely long names usually have a shorter version that they will use if forced to. I would judge that 128 characters should be enough for any reasonable full name, and 64 characters should be enough for any christian name, middle name, or surname. Just make sure you enforce the same limit in the input field, UI code, database interaction layer, etc.
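One way to keep the limit identical everywhere is to define it once and let each layer read it. This is only a minimal sketch, assuming the 128-character figure suggested above; `MAX_FULL_NAME`, `validate_full_name`, the form field name and the column definition are all hypothetical names made up for illustration.

```python
MAX_FULL_NAME = 128  # assumed limit, taken from the suggestion above


def validate_full_name(raw: str) -> str:
    """Return the trimmed name, or raise ValueError instead of silently truncating."""
    name = raw.strip()
    if not name:
        raise ValueError("name is required")
    if len(name) > MAX_FULL_NAME:
        raise ValueError(f"name is longer than {MAX_FULL_NAME} characters")
    return name


# The same constant can feed the input field and the column definition
# (both strings are illustrative, not tied to any particular framework).
html_input = f'<input name="full_name" maxlength="{MAX_FULL_NAME}">'
column_ddl = f"full_name NVARCHAR({MAX_FULL_NAME}) NOT NULL"
```

Rejecting with a clear message, rather than truncating, at least lets the user decide which shorter form of their name to enter.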
One great way to get a handle on these kinds of situations is to see how other people have dealt with the problem. Fire up some other applications or web sites from respected companies and see what they have done. Chances are if a solution worked for Google, it will work for you.
These days you'll also need to consider the possibility of Unicode characters in the name. Depending on how your users interact with your application, they could easily introduce these using a non-US keyboard. So make sure you accept them or reject them with an appropriate error message.
One project I worked on handled mortgage/legal documents. I recall someone being added to the system who had eight middle names and their full, legal name was 514 characters long. So don't be stingy if it's a business app :P
Interestingly enough, just yesterday I was reading about a woman who missed her flight home because her name was too long to be accepted by the check-in systems (they say it would not fit on the boarding pass, and it also had non-ASCII characters in it).
The name is: Ulrika Örtegren-Kärjenmäki and you can read the full story online:
http://www.thesun.co.uk/sol/homepage/news/article1716800.ece
I guess it depends on what your intended audience is. Latin Americans and Spaniards use 2 surnames, which for some reason is very confusing for Americans - I really hate being called by my maternal surname rather than my paternal.
Problem is most American-produced software seems to believe everyone in the world uses only one surname and one or at most 2 personal names. And since in the Anglo sphere surnames + names are usually short, sometimes you get unfortunate results like not being able to submit your whole legal name (and having problems afterwards for your passport says otherwise).
I myself have 2 personal names and some people I know actually have 3 or more, though this is really rare nowadays. Now then, if I recall correctly, a Russian's full name has the personal name, the patronymic (your father's name plus -ovna or -ovich) plus the surname proper. And then East Asian names (as well as Hungarian ones) have surname first, personal name last. And Icelanders don't really have surnames at all; they only use patronymics after their personal names (think of Björk Guðmundsdóttir or Leif Eriksson - literally "Bjork daughter of Gudmund" and "Leif son of Erik", respectively). And Greek (and some other) surnames have different masculine and feminine forms (e.g. Gatsiopoulos versus Gatsiopoulou). In most of Latin America and Quebec, married women are still referred to legally by their maiden surname. Etcetera.
If you intend your software to be used by English speakers only, 50 chars for name, middle and surname each may be fine. If you're willing to go international at some point, you should probably reserve far more space. I'd suggest also keeping surnames and names in separated fields, if possible use 2 surname fields (that's how we do it usually in Mexico). Probably even 3 if you want to add the option of "single surname" (or the opposite, married surname) for females. If possible, use the terms "family name" or "surname" rather than "last name".
All in all, it's not as simple a subject as it seems...
My wife has a hyphen in her first name and regularly comes across systems that only support hyphenated last names. Worry about your special characters and not just your length.
I also work with someone with just a single name (no last name ... or is it no first name?). Be careful with your validation too!
Arabic names can be as many parts as you want, I could give you my name in 20 or 30 parts, that would include many generations and family/tribe names.
It's basically a linked list of "first names", each pointing to their father's, and a tree of families and tribes that belong to each other.
We sometimes like to use the word bin, meaning son of.
You can call me:
(it goes on infinitely; I have a list of many of these names on my blog).
That's why I don't like it when you have separate fields, I prefer to have one field called "Full Name", and the user would fill it in the way he/she would prefer. First/Last is something I don't mind, But I don't have a "Middle" name ...
Why did no one mention
Johann Gambolputty de von Ausfern-schplenden-schlitter-crasscrenbon-fried-digger-dingle-dangle- dongle-dungle-burstein-von-knacker-thrasher-apple-banger-horowitz- ticolensic-grander-knotty-spelltinkle-grandlich-grumblemeyer- spelterwasser-kurstlich-himbleeisen-bahnwagen-gutenabend-bitte-ein- nurnburger-bratwustle-gernspurten-mitz-weimache-luber-hundsfut- gumberaber-shonedanker-kalbsfleisch-mittler-aucher von Hautkopft of Ulm?
clicketyclick [1]
[1] http://www.youtube.com/watch?v=UDPqB9i1ScY
varchar(max) and let SQL Server sort out the rest!
I don't think there can ever be a correct answer to this as someone could be born tomorrow with the longest name in the world + 1. People with excessively long names are probably used to shortening them due to restrictions on size elsewhere in their lives, so I would go with your experience of the particular language/culture you're dealing with, add a bit more on top and hope you don't offend any edge cases by not giving them enough space.
On a related note... make sure that you can take single-character last names. I have a few friends with the last name "O", who are always having problems signing up for new accounts anywhere they go.
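A permissive check along these lines would let one-letter names, hyphens, apostrophes and non-ASCII letters through. It is a rough sketch only: `is_acceptable_name` is a hypothetical helper, and the choice to reject nothing but empty input and control characters is an assumption, not a rule from any standard.

```python
import unicodedata


def is_acceptable_name(raw: str) -> bool:
    """Reject only empty input and control characters; accept everything else."""
    name = raw.strip()
    if not name:
        return False
    # Unicode category "Cc" covers control characters such as NUL or newline.
    return all(unicodedata.category(ch) != "Cc" for ch in name)


for candidate in ["O", "Wu", "Anne-Marie", "Chris Fuamatu-Ma'afala", "Björk"]:
    print(candidate, is_acceptable_name(candidate))
```

Note that there is deliberately no minimum length above one character and no allow-list of letters.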
Something else to keep in mind:
Falsehoods Programmers Believe About Names
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
You need to declare what your requirements are. Is the name entered required to be legally correct? Is this a small embedded system or a gigantic intelligence database?
Since the content and layout of names is arbitrary and culturally dependent, you should not hard code any assumptions.
Just provide one very large (1024) or unlimited size Unicode field for their proper name. Make sure it is only sanitised to prevent injection attacks.
Provide another sizable field for their 'preferred' name. This can be used on envelopes and in customer calls. It may bear no relation to their proper name, but that is irrelevant; e.g Hubert Blaine Wolfe.
Most of the requirement to order names is based on the need to be able to quickly find a record in a government record warehouse or in a phone book. So don't bother trying to group them unless there is an explicit requirement specifying an exact grouping scheme.
If you need to be able to find people by name implement a modern search. Return partial matches after the user has entered three characters; (but allow the user to search on one). And order the results so exact matches are first, matches on a whole word are next, then matches on part of a word are last.
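The ordering described above (exact matches, then whole-word matches, then partial matches) could be sketched like this. It covers only the ranking, not the three-character threshold, and it scans a plain in-memory list; `rank_matches` is a hypothetical helper, and a real system would more likely push this into the database or a search engine.

```python
from typing import List, Optional


def rank_matches(query: str, names: List[str]) -> List[str]:
    """Order names: exact match, then whole-word match, then substring match."""
    q = query.casefold().strip()
    if not q:
        return []

    def tier(name: str) -> Optional[int]:
        folded = name.casefold()
        if folded == q:
            return 0          # the whole name equals the query
        if q in folded.split():
            return 1          # the query equals one whitespace-separated word
        if q in folded:
            return 2          # the query appears inside a word
        return None           # no match at all

    hits = [(tier(n), n) for n in names]
    hits = [(t, n) for t, n in hits if t is not None]
    # sorted() is stable, so names keep their input order within each tier.
    return [n for t, n in sorted(hits, key=lambda pair: pair[0])]


people = ["Hubert Blaine Wolfe", "Wolfe", "Wolfeschlegelsteinhausenbergerdorff", "Chaminda Vaas"]
print(rank_matches("wolfe", people))
```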
There's no easy answer to this question.
The answer depends on what sort of data you expect to handle with your system. Do you expect only Americans to exist in your system? In that case Latin names could be considered your special case. But if you expect to deploy world wide you'll need to handle American names (3 parts), Latin names (4 parts), Arabic names (7 parts I think), Asian names (no idea on number of parts), etc. The ISO 8000 standard attempts to provide a framework to answer this sort of question, but does not answer it directly.
Also don't forget to account for the multibyte character sets. For example, a name written in the Traditional Chinese character set may be 15 characters, but those 15 characters may take upwards of 45 bytes. 1 character != 8 bits
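To see the difference between characters and bytes, here is a small sketch that counts both for a few names. The `sizes` helper is hypothetical, and the normalisation to NFC is an assumption made so that accented letters are counted as single code points.

```python
import unicodedata
from typing import Tuple


def sizes(name: str) -> Tuple[int, int]:
    """Return (code points, UTF-8 bytes) for the NFC-normalised name."""
    normalised = unicodedata.normalize("NFC", name)
    return len(normalised), len(normalised.encode("utf-8"))


print(sizes("Wolfe"))                # ASCII: one byte per character
print(sizes("Örtegren-Kärjenmäki"))  # each accented letter takes two bytes
print(sizes("司馬相如"))              # these characters take three bytes each
```

Whether a column limit counts bytes or characters depends on the database and the column type, so it is worth checking before picking a number.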
If you're working with Western names written in Western alphabets, three fields of 256 characters each should cover most of your cases. Or you could follow the lead of the United States Postal Service for internal shipping labels (the combination of first, middle, and last names cannot exceed 32 characters, including separating spaces). PDF link [1]
[1] http://www.usps.com/webtools/_pdf/InternationalLabelsv12.pdf
6 characters max, all upper case and padded with spaces. The mainframe lets you store eight, but let's not get ahead of ourselves. And, to save space, only use two digits for the year.
I agree with the VARCHAR(MAX) statement above. I would have upvoted it instead of agreeing with it, but I guess I'm not allowed to do that yet. I don't think the length of your database field is going to compromise your security at all.
When I program I usually use 25 character string for the first name and 50 character string for the last name. Using those settings I have not yet run into any problem with it being too short.
The longest name in an NFL stats db I have is Chris Fuamatu-Ma'afala. But that only contains players with offensive stats.
I don't know why I even remember that.
I read an article on the BBC a while back about Sri Lankan brothers with very long names, who did find that they had trouble with some database systems. Hang on ...
Hm, can't find it. But the full name of the cricketer generally known as "Chaminda Vaas" is Warnakulasuriya Patabendige Ushantha Joseph Chaminda Vaas. And it's not just Sri Lankans - there is an English cricketer named Ebony-Jewel Cora-Lee Camellia Rosamond Rainford-Brent ("EJCLRC Rainford-Brent" on the scorecard, even though that reverses two of her names she apparently prefers it).
In practice, unless you're doing government work anyone with an extremely long name will probably use a shortened version of it for practical purposes.
I think that depends on the country you are developing your application for, or should I say the culture you target.
I've worked in Mexico for the government and had the opportunity to come across these kinds of problems, and you wouldn't believe what you might find. People born before the '20s (there were records of these people even if they were dead) had very long names in Mexico. We used a 50-character standard for each field.
For example, my full name is Gustavo Adolfo Rubio Casillas, which is "pretty long" compared with someone born in the U.S., where the mother's last name is not important; in Latin American countries it is. Another difference is that many people in English-speaking countries abbreviate the middle name, as in John F. Kennedy.
I'm working now on a project for a car insurance company in the U.S. and I have access to the insureds database, and the names seem to be shorter than the ones I worked with in Mexico, so it definitely depends on the place where your application will be used. German names, for example, tend to be even longer than Latin American ones.
And finally, it also depends on the system/technology. For example, for the insurance application that we are developing we have to keep to certain "standards" because the customer keeps their data on an iSeries server from IBM, which uses the file-based AS400 database, and that has tighter limits on field lengths than, say, SQL Server. This information must also be exported to a file that uses an insurance standard with its own limits, so even if our SQL Server database or the AS400 supports, say, a 250-character field, we have to truncate the data in order to fit in these XML files.
So, summarizing, it is a combination of factors: the place where your deployment will be made, the technology that you are using, the technology that you have to interoperate with, and business rules.
I'd also like to mention titles (or appellations) as part of people's names.
I've known companies who have lost business because they couldn't address their customers properly. Mr, Mrs & Ms aren't enough.
Then there are people who have multiple titles. For instance, I know a Rev. Dr.
I once had to find an answer to this question while building a database for an international summit. I quickly found that the longest family names are typically from Madagascar (where some family names can reach 20 to 25 characters), while the longest forenames were usually found in France ("Paul-Henri Emmanuel" is a nice example). By combining the two, I decided that 52 characters (that was a long time ago, so don't ask me why 52 instead of 50 or 60!) was the correct size.
We finally registered 10 000 people coming from approximately 60 different countries. And it worked.
At the very least you want to make sure you can accommodate world famous, Emmy-award winning actor Kiefer William Frederick Dempsey George Rufus Sutherland [1] as a client.
[1] http://en.wikipedia.org/wiki/Kiefer_Sutherland
New systems are usually going to be expected to interact with legacy systems in some way, and that's a decent thing to base your decision on. That way, if anyone questions you later, you can always say you based your decision on software package X because your software imports/exports to that system.
If you have nothing else better to base your decision on, use nvarchar(255) because that's the default type when SSIS is importing data from an Excel spreadsheet.
Never make arbitrary decisions. Always base it on something, anything.
Also remember that relatively few countries follow the American first name, middle initial/name, last name pattern. In Brazil it's common to have more than one "first name" and more than one "last name", while few people would consider themselves to have a "middle name".
Good question. I suppose a follow on question would be how to deal with someone called Mrs. x'; DROP TABLE users; --
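For that particular customer, a parameterised query is the usual answer. Here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database; the `users` table and `full_name` column are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)")

hostile = "Mrs. x'; DROP TABLE users; --"

# The ? placeholder passes the name as data, so the quote and semicolon
# are stored literally instead of being parsed as SQL.
conn.execute("INSERT INTO users (full_name) VALUES (?)", (hostile,))

stored = conn.execute("SELECT full_name FROM users").fetchone()[0]
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'users'"
).fetchone()[0]

print(stored)       # the name comes back unchanged
print(table_count)  # the users table still exists
```

With placeholders the name does not need to be altered or stripped of punctuation at all, which matters for people whose real names contain apostrophes.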
Looking at my VISA card (Danish) my full name (Thomas Angelbo Christensen) almost spans the entire card. That's 26 characters including spaces.
That's one form of restriction on name-length.
I always go with 256 characters for full names. Which I think is more than enough.
I'd also go with 256 characters, because it's a nice round number. 256 is kind of my default "good starting point" field length.
When I was in school, there were some south Indians in my class with names so long they would always be cut off on the class list. Here are a few examples: http://www.kamat.com/econtent/amusements/saythis.htm
Good Question.
I typically have fields for first, middle, and last names that are limited in size to what I would expect my users to input. Most of the time I leave the 50-character default MsSql provides, but you have to take into consideration your users and type of application. Some applications may only need a nickname while others need the full name. Plus, users from different parts of the world are going to have different average name lengths. Of course, if your app is global then it is best to go with the worst case.
I am currently working on an application in which having the exact legal name will be important. After reading this, I am thinking of having an additional field of legal name that should, by the looks of Guinness World Records, be able to hold more than 791 characters. Maybe 1024 characters for a good round number and future considerations? Then I would use a combination of first, middle, and last for average display purposes whilst still providing the ability to store full legal name.
This is the modern world, storage is cheap. Why limit a name field to 30chars? Set the DB field to 1024 each for first, middle, last. For the interface, set the display size to something that's readable (30/30/50?) but keep maxlength at the DB limit.
This should work for enough people that the tail will be really small (there will always be exceptions unless you support an unlimited size).
"Pedro de Alcântara Francisco Antonio João Carlos Xavier de Paula Miguel Rafael Joaquim José Gonzaga Pascoal Cipriano Serafim de Bragança e Bourbom" was Peter I's complete name [1].
Anyway, Dynite's answer is so much funnier. :-D
[1] http://pt.wikipedia.org/wiki/Dom_pedro
I guess a safe assumption is that a human rarely has a full name longer than 256 characters. 256 characters is roughly 3 lines, assuming an 80-character limit on each line.
Actually, European names aren't often really long; perhaps even 64 characters would do and most people would be OK with it. If you want to make sure, you could use 128 characters. (I would, since 128 characters are really nothing with today's memory standards.)
Unless you are writing a government database for Mexicans (there you wouldn't limit the name lengths), I guess you could use 64 as the character limit for the name field.
To my knowledge, nobody has ever pasted a novel into a name field. It would not even be useful, since they would have to do it thousands of times to make a dent. It's somewhat equivalent to spamming, though, and not something I would worry about, since name fields often don't allow links anyway.
I've got friends with a long, hyphenated last name that regularly get bit by this. I'd say 30 chars each for firstname and lastname, or 50 chars if they're in one combined field.
Overkill is good though, and storage is cheap to free these days, so take the above as a minimum and multiply by your favorite overengineering factor.
this link [1] has some interesting data on it.
[1] http://everything2.com/index.pl?node_id=1534419
Just hope we don't get this user [1]
[1] http://en.wikipedia.org/wiki/Brfxxccxxmnpcccclllmmnprxvclmnckssqlbb11116
I don't think maximum lengths are as important as minimum lengths. For every Johann Gambolputty* you'll likely get 100 very annoyed "I"s and "Wu"s and "O"s.
Wanted to add a quick comment re: sanitizing your names (see mat_geek's comment above).
Check out the following comic that helps show why it's important to protect your database from SQL injection.
I'll usually start the site with a generous length for both first and last name... maybe even a varchar(255). If there's no "find a user" type of feature in the UI you're probably not going to index the firstName & lastName fields so starting off big isn't a huge issue...
After a reasonable stream of data has come thru (x months worth, n records) I'll examine the db table and see what my users' contributed data is like and possibly make adjustments from there.
But really, unless you're indexing the data, it's not that big of a deal if you have "fat" columns (in this situation). Disk space is cheap... pissing off a potential user/customer is expensive.
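Examining what users have actually entered can be as simple as one aggregate query. Here is a rough sketch against an in-memory SQLite database; the `users` table, its columns and the sample rows are assumptions for illustration, and the same MAX/LENGTH idea carries over to other databases with their own function names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        ("Wu", "O"),
        ("Chris", "Fuamatu-Ma'afala"),
        ("Sabine", "Leutheusser-Schnarrenberger"),
    ],
)

# For TEXT values, SQLite's LENGTH() counts characters rather than bytes.
longest_first, longest_last, rows = conn.execute(
    "SELECT MAX(LENGTH(first_name)), MAX(LENGTH(last_name)), COUNT(*) FROM users"
).fetchone()

print(longest_first, longest_last, rows)
```

If the longest stored values sit right at the column limit, that is a hint some users were cut off and the limit should grow rather than shrink.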
Doesn't that make all the applications (web/windows) unusable to accept a legit long name? :)
GLASTONBURY, England, Nov. 3 (UPI) [1] -- A British 19-year-old has officially changed his name to "Captain Fantastic Faster Than Superman Spiderman Batman Wolverine Hulk And The Flash Combined." The Glastonbury, England, teenager -- originally named George Garratt -- said his new name, which is thought to be the world's longest, has so outraged his grandmother that she is no longer speaking to him, The Telegraph reported Monday. The teen said he used an online service to officially change his name for a $20 fee. "I wanted to be unique," Captain Fantastic said of his name choice. "I decided upon a theme of superheroes."
[1] http://www.upi.com/Odd_News/2008/11/03/Teens_Fantastic_new_name_Super_long/UPI-90361225751268/
Are you writing a general purpose application that'll be used by the entire world? If not, just pick the logical choice based on your application's potential userbase. Otherwise, use varchar(max) as suggested. Worrying about the relatively few odd people with extremely long names will get you nowhere.
The longest ordinary German name I came across is "Sabine Leutheusser-Schnarrenberger", a politician in Bavaria. Just one first name and a last name, nothing fancy.
Realistically, the longest names I've come across are German, African and perhaps some other European names, but I would set a VARCHAR limit of 40 for FirstName and LastName. You can also create additional columns like LegalName, ShortName, AliasName, etc. to augment a person's identity. Here in Puerto Rico, as in other Latin American countries, they add SurNames (mother's maiden name) when identifying a person. Example: JUAN RIVERA RODRIGUEZ (please note that they are similar to compound last names which are also sometimes used in the USA, e.g. EVERT-LLOYD, but inserted only into a LastName column). All apps I have developed have SurName VARCHAR(1,40) to further identify a particular person.
If you need to ask this question then you're obviously not at 7th normal form.
Ideally you would create two tables
LastNames
+------------+----------------------+----------------+
| LastNameID | ThirtyOrSoCharacters | NextLastNameID |
+------------+----------------------+----------------+
When NextLastNameID = -1 or null or something, then you're done. Don't try sorting or querying this table or anything.