Meta Stack Overflow: Let Me Flag That For You - URL Shortener Cleanup
[+96] [11] Time Traveling Bobby
[2011-07-20 18:59:11]
[ discussion hyperlinks clean-up ]
[ http://meta.stackoverflow.com/questions/99136/let-me-flag-that-for-you-url-shortener-cleanup ] [DELETED]

The Request

I set out on the Crusade for the Holy Grail of no-redirections (named by Denilson Sá [1]) to eliminate every shortened URL [2] on Stack Overflow. But the foreign hordes are overwhelming me, and I need reinforcements!

The Mission

Find every shortened URL, follow it, check it and inline the unshortened/long version into the post, like this:

For details see the documentation here: http://tinyurl.com/3zbelpa

Turns into:

See the documentation for further details [3].
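
The rewrite above can be sketched in Python as follows. This is only an illustration, assuming the short link has already been followed and checked by hand; the function name `inline_link` and the `example.com` target are made up, and the post itself does not prescribe any tooling.

```python
def inline_link(text: str, short_url: str, long_url: str, description: str) -> str:
    """Replace a bare shortened URL with a described Markdown link to its target."""
    if short_url not in text:
        raise ValueError(f"{short_url} does not occur in the text")
    return text.replace(short_url, f"[{description}]({long_url})")


before = "For details see http://tinyurl.com/3zbelpa"
after = inline_link(
    before,
    "http://tinyurl.com/3zbelpa",
    "http://example.com/docs",  # hypothetical target standing in for the real page
    "the documentation",
)
print(after)
```

The surrounding sentence usually still needs a human touch afterwards, as in the "Turns into" example.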

Watch out for hidden LMGTFY links [4] and flag them for Moderator Attention if the answer only consists of those. If the answer does also hold valuable information like other links or further information, edit the LMGTFY-Link out, and leave the above link for future reference.

If you encounter links to duplicates, vote to close/flag the question, and edit the answer anyway.

But, watch out for traps! As Joel Coehoorn♦ [5] informed me, there are edge-cases which need the existence of shortened URLs. Including but not limited to:

Make sure that you're not breaking any of these by inlining them.

If you find shortened URLs in comments, there's not much we can do about them at the moment. LMGTFY and other objectionable content needs to be flagged nonetheless.

Also there are shortened URLs hiding in the woods (known to some as "Code"), these are no danger to our lands and can therefore be left unchecked and unchallenged.
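
A rough sketch of how one might spot shortened URLs in a post while leaving code alone, as described above. The host list only contains shorteners named in this thread, and the code detection is a simplification (fenced blocks, lines indented by four spaces or a tab, and inline backtick spans), not a full Markdown parser. `find_short_urls` is a hypothetical helper name.

```python
import re

# Shortener hosts mentioned in this thread; extend as needed.
SHORTENER_HOSTS = {
    "tinyurl.com", "bit.ly", "j.mp", "ow.ly", "wp.me", "t.co", "tr.im", "alturl.com",
}

# Simplified idea of "code": fenced blocks, indented lines, inline backtick spans.
CODE_PATTERN = re.compile(r"```[\s\S]*?```|^(?: {4}|\t)[^\n]*$|`[^`\n]*`", re.MULTILINE)
URL_PATTERN = re.compile(r"https?://([^/\s)\]]+)[^\s)\]]*", re.IGNORECASE)


def find_short_urls(markdown: str) -> list[str]:
    """Return shortened URLs that appear in the prose of a post, ignoring code."""
    prose = CODE_PATTERN.sub(" ", markdown)
    hits = []
    for match in URL_PATTERN.finditer(prose):
        host = match.group(1).lower()
        if host.startswith("www."):
            host = host[4:]
        if host in SHORTENER_HOSTS:
            hits.append(match.group(0))
    return hits


sample = (
    "See http://tinyurl.com/3zbelpa and `curl http://bit.ly/abc`.\n"
    "\n"
    "    wget http://j.mp/xyz\n"
    "\n"
    "Also https://stackoverflow.com/q/1"
)
print(find_short_urls(sample))
```

Only the link in the prose is reported; the two links inside code are skipped.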

The Tools

Our most valuable light in the darkness of the night is the search (roughly sorted by number of hits, cleaned ones at the bottom (doesn't mean that they don't come back)):

Feel free to expand and edit that list.

While paying a visit, Rob Hruska [26] showed us the way to a magician which allows us to see beyond shortened URLs without fear for ourselves [27]. Jeff Mercado [28] also showed us an apparatus which does the same [29]. These are only tools on our crusade to protect and help ourselves, we need to cleanse the land from redirections nonetheless.

The Reward

Eternal honor and glory...and of course you can keep every captured Flag-Weight and Badge.

(3) Here's another site that can expand the links for us for many different sites: longurl.org - Jeff Mercado
(15) While fixing these, please don't just put a "+" after the shortened link to go to the shortener preview page; inline the full site URL. - Rob Hruska
@Rob Hruska: I seriously hope nobody gets that idea. - Time Traveling Bobby
@Bobby - I've seen a few and have fixed them. - Rob Hruska
@Rob Hruska: Oh dear...I'll try to point that out. - Time Traveling Bobby
(1) @Bobby and @Rob: Sorry, that was probably me. I didn't realize my mistake until I had already "fixed" about a dozen posts. - Chris Frederick
(1) @Chris - NP, I think I got most of them. - Rob Hruska
(5) That explains the edit queue number. I've never seen it that high before. - mmyers
Looks like bitly alias j.mp is in use: stackoverflow.com/search?q=body%3Aj.mp - about 60 hits. - martin clayton
(6) Reminder: funky Wikipedia links and links that include URLs in the querystring will usually be properly encoded if copied out of the Firefox or Chrome address bar. IE users can continue to sell their souls to URL shorteners, or get a real browser. - Shog9
Would sf.net be a candidate in this clean up? - martin clayton
(2) I guess we should extend this quest to other sites from the stackexchange family! - Denilson Sá
(2) Some time ago we looked at http status codes for a sample of the links found in stackoverflow posts - over 10% returned 301. - martin clayton
@martin clayton: sf.net redirects me to SourceForge...I'm not sure what you mean? But if you mean whether we should remove an address linking against sf.net instead of sourceforge.net, then yes, I think so. Inlined long links are always a good thing. - Time Traveling Bobby
(2) This question title delivers the lulz. When you need a break, take a moment to enjoy everything that live.lmgtfy.com has to offer. - Brad Mace
@Bobby - that's exactly what I was thinking. URL shorteners aside, there are quite a few permanent redirects for links in SO posts. For example I think all http://github.com links redirect to the https://github.com equivalent. That one is not so bad, but such redirects are perhaps evidence of the onset of link rot. - martin clayton
@martin clayton: I wouldn't include such things in this crusade. We're here to end the life of all real URL-Shorteners. Especially the github-case is not a problem at all in my opinion. The not-so-problem with the sf.net links is, that we know where it redirects us to. We don't know that about the other services. - Time Traveling Bobby
(1) We need some way to educate our users. I've seen the hit count for some URL shorteners go up from yesterday! - Denilson Sá
@Denilson Sá: I'll try to come up with a reasonable feature-request...the ban was already declined (with a very good explanation why, though). So maybe we can get something different in place, like a warning. - Time Traveling Bobby
Chrome extension to expand shortened URLs: View Thru - Al E.
Flagged some and edited some. Be nice if these URL shorteners were banned. - staticx
wp.me is another one... I added him to the list. - fretje
Do url characters count in comments? They seem to. If so, what happens if the url contains one of those nasty, long slugs so that it seriously cuts into the available number of characters for your comment? Or you want to include several long links in your comment? Typically, I'd shorten it to avoid having to make multiple comments. Perhaps SO needs its own shortener a la twitter that automatically shortens links in comments and only counts the characters in the short url. - tvanfosson
@BrAvada Kedavran: That feature request was denied due to technical problems. I have to say that Jeff outlined the problems very well. - Time Traveling Bobby
@tvanfosson: Yes, they do. There's at the moment not much we can do about comments anyway. - Time Traveling Bobby
ow.ly is down to 12. I think it has been sufficiently cleaned up. - staticx
(5) As a part of this effort, could a dev remove the shortened url from the post-ban message please? - M. Tibbits
(1) What do we have to do when Jon Skeet posts a tinyurl? stackoverflow.com/questions/4344336/… - Cyril Gandon
(1) @Scorpi0 - I want to edit it, but I don't dare - shanethehat
wp.me is all cleaned up. - agf
Now that IBM has their own custom shortened URLs, is it permissible to post those? As I noted in a comment below, IBM has hundreds of SupportPacs all with 4-char names. If you know that they live at ibm.co/SupptPacs and that SupportPac XXXX is at ibm.co/SupptPacXXXX then you can get to them from any browser without resorting to search. The mnemonic names have value. Just how blunt an instrument is this policy intended to be? - T.Rob
stackoverflow.com/questions/3826983/… needs to be closed, it relies on the shortened link and its target is now dead. - agf
j.mp all cleaned up. - agf
What about shortened URLs to Google Books? They fail a lot when clicked. - Verbeia
What I understand the least about these shortened URLs are the posts that use mostly full-length links, and then only a single, or two shortened links among them. - Nightfirecat
What do we do about shortened URLs that don't point anywhere anymore? (e.g. Destination site no longer exists) There's nothing to replace them with. Flag as too localized? - John
(1) @Verbeia: Try to replace them with a working version, if not possible leave them in place (for now). - Time Traveling Bobby
@John: Remove them as dead links. If the answer isn't useful anymore after that, flag it. - Time Traveling Bobby
(1) alturl.com is taken care of. - Nightfirecat
(1) @PaddedCell Maybe we should stop removing them, just list them with zero instead? I'm pretty sure t.co was on there and cleared previously, and now it's been re-added. - agf
@agf: True, then we'd also have a reference list of URL-Shorteners and the names of the brave souls which cleaned them up. - Time Traveling Bobby
(1) If you use the reference syntax (which IMO is a lot cleaner), it won't break any links. - NullUserException อ_อ
Ok, just noticed... How are we supposed to get rid of the cl.ly links, when they link to resources? (screenshots, files, etc) They don't point to other websites. - Nightfirecat
@Nightfirecat: What do you mean? Just inline them like every other link. You can always use LongUrl.org to obtain the long version. - Time Traveling Bobby
@PaddedCell: That's the thing - they don't have long versions. They host images or files on their own servers. longurl.org/expand?url=http%3A%2F%2Fcl.ly%2F5f5 - Nightfirecat
@Nightfirecat: Fascinating. I'd suggest re-uploading the images using the SO-Tool. As for the downloads, check if they still exist, and inline them so that it is clear that it is a file download. In that case it's not used as a URL-Shortener but as a file-sharer. - Time Traveling Bobby
Interesting problem I'm encountering - ran into at least one user who rolled some of his (previously URL-shortener-including) posts back to their previous states. I obviously went and rolled back to an edit that expanded the shortened links, but how should these posts be dealt with, outside of just rolling back? - Nightfirecat
(2) @Nightfirecat: Despite the ensnaring thought, do not engage in edit-wars of any kind. If somebody rolls back these changes, only roll back once and leave a comment with further explanations. If the user rolls back again, flag for Mod Attention. Though my heart bleeds if I think about bothering the mods with such peanuts, it's better than engaging in a vendetta with a user. Don't forget, we have the blessings of the Diamonds with us. - Time Traveling Bobby
Not using indirection (short-links are a form of indirection) is a terrible idea, for the same reason that hard-coded data in an application is a bad idea. Some short-link providers follow links when they change. Not to use too broad a brush, but this sort of wooly-thinking has led to a lot of really bad software in our time. - Terry Gardner
@TerryGardner: Pardon me? I can't follow you. - Time Traveling Bobby
I purposely use third party url shorteners in order to work around this issue: meta.stackoverflow.com/a/98950 - Michael
@Michael: That's abuse...but thanks for telling us. - Time Traveling Bobby
[+22] [2011-07-20 20:11:50] Denilson Sá

That's great, but we also need some way to remove URL shorteners from the comments.

During this quest of the Holy Grail of no-redirections, I've found many short URLs in the comments, and it makes me sad that I can't do anything to fix those.


Awesome title! I have to get this somewhere into my question! And you're right...we can flag LMGTFY-Comments, but can't really do anything about the others, I fear. We'll see, maybe a Mod comes along with a good idea. - Time Traveling Bobby
Eh,.... LMGTFY? - GUI Junkie
(23) @GUI Junkie, what? You don't know LMGTFY? lmgtfy.com/?q=LMGTFY :-) (Okay, I admit, this was kinda mean…) - Denilson Sá
I'm smiling now - GUI Junkie
I thought there would be viagra somewhere for me - GUI Junkie
(3) If you flag the comment for mod, we can do the heavy lifting ourselves I guess. I foresee a shitload (more) mod flags and edit queues in my future... - Mark Henderson
(2) Yep. Go ahead and flag them in comments, and a mod can edit the comment. LMGTFY comments can be flagged for removal. - nhinkle
(7) So far the only legitimate use for a url shortener that I know of is the one Jon Skeet uses when hinting to question askers that they should improve their question. It even has a much friendlier name. tinyurl.com/so-hints vs msmvps.com/blogs/jon_skeet/archive/2010/08/29/…, I'd prefer the shortened version as long as I know it's safe and has a friendlier name than a random hash. - Jeff Mercado
@Jeff - and so as not to take up too much of a comment when including long links or multiple links. - tvanfosson
1
[+21] [2011-08-17 13:15:01] Gilles

Thanks to the people who are going through the database and fixing posts. But please don't blindly replace URLs by the longer URLs; take the time to go through the posts and review them.

  • Do not replace shortened URLs if they are part of the question. Example: Split Twitter RSS string using Python [1] — the question is asking how to parse a string that happens to contain a short URL; replacing it by the longer URL would not make any sense.
  • If the URL was directly in the text (and was intended as a link, not as part of the question as above), don't leave [http://bit.ly/abcde](http://real-url.example.com/wibble) in the markdown, take the time to write a real description for the link, like [the description of the `wibble` command in the official documentation](http://real-url.example.com/wibble).
  • If the URL is to an image, upload it to Stack Exchange's image hosting (press Ctrl+G, click on "from the web" and enter the image URL).
  • If you see other problems in the post, such as a signature or spelling errors, take the opportunity to improve the post.
[1] http://stackoverflow.com/questions/1354415/split-twitter-rss-string-using-python
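
To find links that were expanded but still lack a real description (the second bullet above), a small check like the following might help. It is a sketch only: it looks at inline Markdown links of the form `[label](target)`, ignores reference-style links, and `links_needing_description` is a made-up name.

```python
import re

# Inline Markdown links only: [label](target)
INLINE_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def links_needing_description(markdown: str) -> list[tuple[str, str]]:
    """Return (label, target) pairs whose visible label is itself just a URL."""
    return [
        (label, target)
        for label, target in INLINE_LINK.findall(markdown)
        if re.match(r"https?://", label.strip(), re.IGNORECASE)
    ]


post = (
    "Bad: [http://bit.ly/abcde](http://real-url.example.com/wibble)\n"
    "Good: [the description of the `wibble` command in the official documentation]"
    "(http://real-url.example.com/wibble)\n"
)
print(links_needing_description(post))
```

Anything it reports still needs a person to write the actual link text.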

2
[+17] [2011-07-25 14:42:59] Tim Post

Please remember, don't bother with posts that only contain a link. If there is no context around that link (if the link breaks, is the answer still useful?) .. flag the post as a non-answer.

I just deleted a few non-answers that had been edited to expand the actual URL, I hate to see people waste time :)


Go ahead and edit the link anyway, if you have edit permissions - no point in the link staying in that form for any amount of time regardless. - Nightfirecat
3
[+6] [2011-07-20 19:04:37] Joel Coehoorn

You'll likely get better search results for posts entered prior to the last data dump via SEDE [1]. And add a search for http://tr.im/ — It's defunct now but was still active when Stack Overflow first launched.

Aside from that, we've had this discussion before (can't find the link right now) and from that I'll caution that there are a few edge cases where the shortened urls are required. Examples include links into archive.org's wayback machine, browsershots.org, and certain wikipedia links (though I think the last has since been fixed).

[1] http://data.stackexchange.com/
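
For links that carry another URL in their query string, percent-encoding the inner URL is one way to avoid a second literal `http://` in the link. Below is a minimal sketch with Python's standard library; the `longurl.org/expand?url=` form is taken from a link quoted in the comments on the question, and `embed_url` is a hypothetical helper. Whether a given service accepts the encoded form is not something this sketch can promise.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def embed_url(endpoint: str, inner_url: str, param: str = "url") -> str:
    """Attach inner_url to endpoint as a percent-encoded query string parameter."""
    return f"{endpoint}?{urlencode({param: inner_url})}"


outer = embed_url("http://longurl.org/expand", "http://cl.ly/5f5")
print(outer)  # http://longurl.org/expand?url=http%3A%2F%2Fcl.ly%2F5f5

# Decoding the query string gives the inner URL back unchanged.
print(parse_qs(urlsplit(outer).query)["url"][0])  # http://cl.ly/5f5
```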

Good points about the links, I'll add that. - Time Traveling Bobby
I'm quite interested to see that previous discussion. I can't imagine why you would need obfuscated URLs for links to the Wayback Machine to work. The regular links work just fine for me across sessions, and as I understand it, the obfuscation services just store that same URL to be expanded later. - Cody Gray
@Cody - the issue is that you have to include the full url for the original page as a query string parameter for the wayback machine, and the regex kept choking on the second http:// . A quick check shows they've made improvements since then, but I'd still be wary of edge cases. - Joel Coehoorn
Since tr.im is now defunct, how can we "fix" those links? Or... We can't?! - Denilson Sá
(1) @Denilson - In many cases, we'll be able to either infer where they were pointing or run a google search to find an equivalent link. In others, we may just need to flag the post for moderator deletion or edit that part of the post out. - Joel Coehoorn
4
[+6] [2011-08-16 07:12:55] Marek Grzenkowicz

There's a list of 339+ URL shorteners [1] at the LongURL site.

[1] http://longurl.org/services

5
[+5] [2011-07-20 21:51:57] rightfold

Maybe somewhat less used, but CloudApp has a URL-shortener called cl.ly.


You just hit a pot of gold! 232 hits on SO. Thank you very much! - Time Traveling Bobby
(2) Huh... Most times I find a cl.ly link, it's actually for image hosting. Thus, I don't believe those image-hosting links should be removed. - Denilson Sá
6
[+5] [2011-07-20 22:29:47] GUI Junkie

I admire the zeal of the quest and would bang my head through a rock before ever adding a short.url. However, would it not be possible for the asker to propose a more automated approach? It would surely be a flick of the wrist for the SE programmers (Superero without H) to replace that for a 'click here'?

And let me please finish with... amiright?

Edit As George comments, the approach would be to do that on the fly.


(3) Better yet, have the system check the answer for shortened URLs before the answer is posted. - uɐɯsO uɐɥʇɐN
(1) Actually, as the original question proposed, we are not only replacing the short URLs, but also adding relevant link texts. Oh, and from what I saw, there are so many different cases that an automated approach would be a pain to write, and would still fail a lot. - Denilson Sá
I was getting at what George is suggesting, but alas, brevity is always mine enemy. - GUI Junkie
@Denilson, it may not be trivial. But a link has either a text that can be respected or a link that can be replaced with a 'click here' textlink. The end result might not be so nice as a manual edit, but would be workable. - GUI Junkie
7
[+5] [2011-07-25 13:13:50] Al E.

Another nefarious purpose for shortened URLs: hiding amazon.com affiliate tags. (Other affiliate tags as well, presumably.)

I just fixed one of those. I'll not be surprised to find others.
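
Once such a link has been expanded, the affiliate parameter can be dropped from the long URL. The sketch below assumes the affiliate ID sits in a `tag` query parameter (an assumption, not something stated in this answer); the product URL and the helper name `strip_query_params` are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_query_params(url: str, unwanted: frozenset[str] = frozenset({"tag"})) -> str:
    """Return url without the unwanted query parameters; everything else is kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in unwanted
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Placeholder product URL with a made-up affiliate tag.
expanded = "http://www.amazon.com/dp/0000000000?tag=example-20&ref=sr_1_1"
print(strip_query_params(expanded))
```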


8
[+2] [2011-07-20 22:15:01] Caleb

I wasn't aware that shortened URLs were considered poor form. Personally, I always use full URLs in questions and answers, but often use shortened URLs in comments due to the character limit.

If SO policy is to eliminate shortened URLs, then we should:

  • Tell people. I don't know if SO has a mechanism for making announcements, but there should at least be a mention of this in the FAQ.
  • Stop counting URLs against the character limit in comments.
  • Filter new questions, answers, and comments to catch shortened URLs.

Before embarking on this crusade, however, I'd want to know if shortened URLs are really a problem on SO. I understand the theoretical security risk, but in practice how often does it happen that someone posts a shortened URL on SO that points to something other than what's represented? I'd expect those cases to be pretty rare, and I'd hope that such posts would be downvoted into oblivion.


(13) Point 1: a short URL that points to another question in Stack Overflow won't show up in the "Linked" list of questions at the right side. Point 2: it is better for the user experience to know the destination of a link. Sometimes I decide if I should follow or not a link based on the URL itself. Point 3: Shorteners don't improve the user experience in any way. Actually, as I said, they degrade the experience. Point 4: Shorteners are usually used by new users that still don't know how to correctly insert links. - Denilson Sá
(1) 1: Fair enough, but links to other SO questions aren't terribly long anyway -- I'm not sure I'd bother shortening such a link. Links to documentation on other sites (very common in SO comments) can be quite long and shortening them helps in comments. 2: I agree, but on the other hand it's better for the user experience if I can use as many characters as I need to in order to explain something. 3: See 2. 4: Rather than banning shortened links, then, perhaps we should help the new users? I'm not arguing for more shortened URLs, just not convinced that eliminating them is useful. - Caleb
(3) Side note: there is a short form for links to SE posts: site.stackexchange.com/q/postID, e.g. meta.stackoverflow.com/q/99164 for this one. - nhinkle
(10) @Denilson - you forgot the big one: Point 5: link rot - if/when the shortening service goes belly up (eg. tr.im), then all those links become dead & the question/answer/comment they're used in becomes, at least partially, useless. - Alconja
(9) They're poor form everywhere. I'm not going to click links that I don't know where they point. That's just irresponsible. - Cody Gray
@Alconja: True, but that also happens on non-redirected links. Anyway, the fewer redirections there are, the lower the probability of link rot. - Denilson Sá
(2) @Cody - they're not poor form on Twitter, they're absolutely essential when you're character limited. Granted, comments are as limited as tweets, but long urls can seriously cut into your comment space, esp. if you include multiple links (such as to ref. docs). - tvanfosson
(1) @tvanfosson: I don't think Stack Exchange has bought Twitter yet. Can't wait, though; it'd really help that site out to redirect the focus on content rather than noise. - Cody Gray
(3) "Shorteners don't improve the user experience in any way" I'd disagree with this point at least. IBM has hundreds of SupportPacs all with 4-char names. If you know that they live at bit.ly/SupportPacs and that SupportPac XXXX is at bit.ly/SupportPacXXXX then you can get to them from any browser without resorting to search. I can see why they were expanded but I'm hoping that if I can transfer these from bit.ly to IBM's URL shortener that I can restore the mnemonic links in my answers. - T.Rob
9
[+2] [2011-10-04 01:14:53] Erwin Brandstetter

I was just now haunted by an unearthly creature in the form of: "http://kjkh.me/oXek9p". It crept out of this dungeon [1]. In my bewilderment I turn to the noble crusaders of LMFTFY for counsel. What is the righteous path for a lowly peasant?
Downvote? Comment? Friendly advice? Refer to this page?

[1] http://stackoverflow.com/questions/7606386/how-to-restart-id-counting-on-a-table-in-postgresql-after-deleting-some-previous/7610991#7610991

(2) Edit it to link to the actual page. I just did. Downvote, comment, friendly advice optional. - Jeff Mercado
10
[+1] [2011-08-10 23:34:05] Chris Frederick

It has been brought to my attention that you can search for links using the url: option [1]. I don't know precisely how smart this functionality is, but it does seem to ignore links embedded in code. We might be able to leverage this functionality to automatically detect shortened URLs before they are ever posted.

[1] http://stackoverflow.com/search
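
As a sketch, search links for a list of shortener hosts could be generated like this. The `body:` form reproduces the search link quoted in a comment on the question (`stackoverflow.com/search?q=body%3Aj.mp`); the exact syntax the `url:` option expects is not spelled out here, so that query format is an assumption. `search_url` is a made-up helper name.

```python
from urllib.parse import quote


def search_url(host: str, operator: str = "url") -> str:
    """Build a Stack Overflow search link for posts mentioning the given host."""
    query = f"{operator}:{host}"  # assumed query format, e.g. "url:j.mp"
    return "http://stackoverflow.com/search?q=" + quote(query, safe="")


for host in ["j.mp", "wp.me", "ow.ly"]:
    print(search_url(host))

print(search_url("j.mp", operator="body"))
```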

11