Tech:CAPTCHA for registration

From Wikitravel Shared


Swept in from en:pub:

Going through the recent changes and noticing that 9 of 10 new users are spambots, could we/IB add a captcha to the process of creating a new account? AHeneen 20:26, 21 May 2009 (EDT)

Good idea. Should be easy to do using either of these: Jpatokal 03:22, 22 May 2009 (EDT)
Man, those bots are out of control. I've mailed IB about this. Jpatokal 06:19, 22 May 2009 (EDT)
I could see having a captcha for account creation, or if things got worse and worse maybe even for edits from unregistered accounts, but I'd hate to make the average registered user go through that for every single edit. Texugo 00:42, 10 June 2009 (EDT)
My intention was simply to add a captcha for new accounts. AHeneen 15:19, 10 June 2009 (EDT)
We will work on getting this implemented for you. Thanks, Ibsteph 9:46 am, 12 July, 2010
Just wanted to inform you that we have installed the "re-captcha" functionality for you. It will trigger on all new registrations and anonymous edits. Thanks Ibsteph 11:40 am, 21 July, 2010 PST
Wait! This was not supposed to trigger on all anonymous edits! (Only user registrations)
Also, this does not seem to be true on Wikitravel Shared—was re-captcha enabled only on :en? --Peter Talk 13:03, 23 July 2010 (EDT)
This really needs to be enabled here on Shared. We are starting to get quite a bit of user creation spam here. texugo 11:22, 2 July 2011 (EDT)
I agree. Over the past month, I think we have had at least 100 registered spambots on Shared, and they continue pouring in! I suggest you activate CAPTCHA on all language versions. Riggwelter 09:51, 14 July 2011 (EDT)
Incidentally, it definitely needs to be implemented on es: as well. texugo 12:12, 14 July 2011 (EDT)
And it has now begun on sv:. Please activate this on ALL language versions! Riggwelter 10:20, 26 July 2011 (EDT)
This is in the tech queue and will be implemented soon. Targeting August/September, because we may wait until after the Mediawiki upgrade. I will update if I have an earlier ETA.--IBobi 13:19, 10 August 2011 (EDT)
Will this be possible to do in the local languages? Otherwise it could actually really get in the way of confused contributors. --Peter Talk 13:49, 15 August 2011 (EDT)
Assuming that reCaptcha is being used then my understanding is that it should use the language detected from the browser, but that would (obviously) need to be validated. -- Ryan 14:24, 15 August 2011 (EDT)

REcaptcha has been enabled on Shared and all foreign language versions as of this morning. Please confirm, and report any bugginess with the install. Thanks all,--IBobi 13:58, 20 December 2011 (EST)

There have been zero spambot user accounts on shared today - for the past several months before today the site averaged 15-70 such accounts each day, so the CAPTCHA implementation appears to have been successful. -- Ryan 16:14, 21 December 2011 (EST)
That's fantastic!--IBobi 17:10, 21 December 2011 (EST)
Better late than never, I wanted to thank you too, IBobi, for enabling REcaptcha on the foreign language versions. The French version has indeed been much cleaner and easier to patrol for the past month. And Happy New Year, by the way! Joelf 02:33, 26 January 2012 (EST)
I'm really glad to hear that, Joelf -- this year, following the MW upgrade and booking tool, I'd like to circle up on how we can improve the foreign language versions; I've heard anecdotally that some aspects of those pages (editing?) have dropped off of late, and with such a large portion of WT users coming in from non-English countries I'd love to see those pages really shine.--IBobi 13:31, 26 January 2012 (EST)


Also swept from en:pub:

There has been a huge increase in accounts created by spambots of late, so I've made a request that IB enable CAPTCHA for all account registration. By my count there are already 30+ spambot accounts created today on shared:, and the day is barely half over. I know some people hate CAPTCHA and that there may be some objections from those who primarily use non-English languages, so this thread should hopefully serve as a location for anyone to raise issues. For my part, spending time each day deleting spam pages and blocking accounts on a site that is already horribly slow isn't an enjoyable way to spend time, and isn't something that I want to do on Wikitravel for much longer. -- Ryan • (talk) • 17:55, 21 November 2011 (EST)

Actually, looking at this closer it appears that CAPTCHA is already enabled for user registration on English Wikitravel (although not on other versions), so some bots have apparently defeated that defense measure. It would still be good to get this enabled on all versions, barring objections, but it looks like further defenses may be needed. -- Ryan • (talk) • 18:12, 21 November 2011 (EST)
This is an update we have been waiting to do until after the Mediawiki upgrade, as we believe it may resolve itself then. We have the site running on the current version of Mediawiki internally right now. As soon as we can, we will be enabling it externally so that a number of Admins can test functionality before we push it live.--IBobi 21:31, 21 November 2011 (EST)
As an aside, are these usernames being blocked correctly? I've seen some blocks which I think have imposed an infinite ban on the IP of the originating spambot in addition to an infinite ban on the username. The latter is reasonable, but the former wouldn't be, since these botnets often use dynamic IPs, which may belong to a real contributor one day. Perhaps I'm misunderstanding the block text, though. --Inas 00:26, 22 November 2011 (EST)
I'm probably guilty of blocking the underlying IP. At least on shared: any blocked IP can still comment on the corresponding talk page, so I figured the benefit of blocking a known-bad box exceeded the disadvantages, but if there is concern let's put something in policy and I'll be more circumspect. -- Ryan • (talk) • 00:30, 22 November 2011 (EST)
Typically (and I think vaguely according to policy) we put a three month block on the IP of a spambot, rinse and repeat? --Inas 00:44, 22 November 2011 (EST)
The last bullet point in that section is "Blocks of user accounts created by spambots. Some of the more advanced spambots are actually capable of creating user accounts. These accounts should be permanently blocked as soon as they are identified as being spambot accounts." That doesn't explicitly state what to do with the underlying IP, which I think is where the difference of usage is coming from. -- Ryan • (talk) • 01:59, 22 November 2011 (EST)
True enough. Are you (or anyone else) saying you think we should block the underlying IP? --Inas 03:09, 22 November 2011 (EST)
To this point I've been blocking the underlying IP since any bot capable of creating a user account will likely just create more, but if others disagree with that approach I'm happy to change it. -- Ryan • (talk) • 10:46, 22 November 2011 (EST)
The same reasoning could be applied to a bot on an IP editing anonymously, but we don't block that indefinitely. There seems to be some inconsistency in blocking a bot IP for 3 months, unless it creates an account, in which case we apply an indefinite block. --Inas 17:08, 22 November 2011 (EST)
(Re-indenting) Here's my reasoning: an indefinite block for a registered-user-spambot is important because after X number of days that account's edits will be automatically marked as patrolled, it will be able to move pages, and do other things that an IP account can't. We've already seen blocked spambot accounts return to make multiple edits, so we know that's a possibility. When blocking the account, unless I'm misreading the user block screen, there doesn't appear to be a way to permanently block the user account but only temporarily block the underlying IP address - if that's wrong please correct me - and the IP address could therefore just spawn new spambots until the end of time if it isn't also blocked. Thus we're put in a position where we could either block a known spambot account for three months and then have to deal with a privileged spambot, or block the spambot and IP permanently and have a slight chance of forcing a real user to eventually ask on a talk page to be unblocked (note: not counting exceptions for shared IPs, there are 256^4 IP addresses which is approximately 4.3 billion, so odds of blocking a real user aren't extremely high). Given those choices I'd prefer the latter, but as mentioned am happy to do whatever the consensus dictates. -- Ryan • (talk) • 18:10, 22 November 2011 (EST)
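The address-space figure quoted above can be checked with a couple of lines of Python. This is only a sanity check of the arithmetic for IPv4 (four octets of 256 values each); it says nothing about how many of those addresses are shared or dynamic.

```python
import ipaddress

# Four octets, 256 possible values each.
total = 256 ** 4

# Cross-check against the standard library's count for the whole IPv4 range.
assert total == ipaddress.ip_network("0.0.0.0/0").num_addresses

print(f"{total:,}")  # prints 4,294,967,296 (roughly 4.3 billion)
```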
Given this fight against the spambots is so dynamic, I think we should use the best ways to fight the problems that we are actually seeing, with the toolset that we have, with minimum collateral damage. I don't see the privileged spambot as an issue right now. We don't see this happening (as yet), and if the spammers were intelligent enough to actually pursue this, then they may also realise that all they have to do is wait out the auto-confirm period without making a spammy edit to achieve the same result; we just don't have the toolset to combat it.
Shared IPs aren't so much the exception. I'd say 99% of botnets are running on dynamic or shared IP addresses. We risk blocking corporates, mobile ISPs, education campuses, and even entire countries that use shared IPs.
Presumably this was the motivation for making our policy to block spambot IPs for 3 months only.
I'm also happy to go with a consensus, but I'd strongly argue that I haven't seen any policy or consensus emerge to block IP addresses indefinitely.
Given we don't seem to have the toolset to match what we want to do, I'd suggest that we just periodically review the blocked IP list, and remove the blocks on the IPs (but not spambot accounts) that are older than three months. Sound reasonable? --Inas 19:23, 22 November 2011 (EST)
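The periodic review suggested above could be sketched roughly as follows. This is a minimal illustration in Python, not a Mediawiki tool: the `BlockEntry` record, the `ip_blocks_to_lift` helper, the sample entries, and the reading of "three months" as 90 days are all assumptions made for the example. It only selects blocks whose target parses as an IP address, so blocks on spambot account names are left in place.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
import ipaddress


@dataclass
class BlockEntry:
    """Hypothetical record for one line of a block list."""
    target: str           # a username or an IP address, as text
    blocked_at: datetime  # when the block was applied


def is_ip(target: str) -> bool:
    """Return True if the block target parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(target)
        return True
    except ValueError:
        return False


def ip_blocks_to_lift(entries, now, max_age=timedelta(days=90)):
    """Select IP blocks older than max_age; account blocks are never selected."""
    return [e for e in entries if is_ip(e.target) and now - e.blocked_at > max_age]


# Made-up sample data using documentation-only IP ranges.
entries = [
    BlockEntry("192.0.2.10", datetime(2011, 7, 1)),      # old IP block
    BlockEntry("ExampleSpambot", datetime(2011, 7, 1)),  # account block, kept
    BlockEntry("198.51.100.7", datetime(2011, 11, 1)),   # recent IP block, kept
]
for entry in ip_blocks_to_lift(entries, now=datetime(2011, 11, 22)):
    print(entry.target)
```

An admin would still have to remove each selected block by hand; the sketch only shows the selection rule being proposed.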
I'm 100% fine with unblocking IPs that have been blocked for more than 3 months, but do we have a tool that shows blocked IP addresses? Special:Ipblocklist isn't showing IPs for recently blocked spambot user accounts - is there another tool, or will the Mediawiki upgrade provide better tools? -- Ryan • (talk) • 19:44, 22 November 2011 (EST)
I've just played around with this.
Firstly, the outcome of a block of the user, including the IP and account creation, is fairly harsh. You can't edit talk pages as an IP, you can't create an account, it directs you to the admin who did the block, but you can't email them, or leave them a message. Essentially, it is a lost potential user. If I can't figure out how to leave a message to get unblocked, then a normal user will give up.
You seem to be able to unblock the IP no problem. You'll see in the block log there is a line for the user, and then an additional line for each IP address that has been blocked. If you leave the user line and remove the others, the user remains blocked and the IP address and account creation are fine again. --Inas 21:24, 22 November 2011 (EST)
Two things: first, I'd support changing policy to specify unblocking spambot IPs after three months as you've described - hopefully the Mediawiki upgrade will make this easier. Second, it seems like it's only on English Wikitravel that a blocked user can't edit his/her talk page - as the spam on shared: shows, plenty of blocked IPs are still spamming talk pages; this will hopefully be another issue that is resolved with the upgrade. -- Ryan • (talk) • 22:21, 22 November 2011 (EST)
It does look like Mediawiki does this without assistance. The IP blocks associated with the user blocks appear to be very short-lived, maybe as short as 24 hours. Accordingly, it doesn't look like any policy issue arises. --Inas 03:44, 27 November 2011 (EST)
Hi, I was about to ask if admins here or IB could enable CAPTCHA on the French WT: there are indeed many spambots and a lot of vandalism, and I'm one of the very few people patrolling this version. Hopefully, that'll make things easier. Thanks Joelf 01:46, 30 November 2011 (EST)
Only IB can do this. Suggest a tech request on shared, follow up with email. --Inas 02:44, 30 November 2011 (EST)


reCaptcha has been cracked (see, among many others, [1]). At the moment the spam on English Wikitravel is still manageable, but we should expect that at some point in the future the volume of spam will again increase significantly. Hopefully Google will find a way to address this in their reCaptcha implementation; otherwise other spam handling techniques will probably need to be considered. I don't know what options are available, but others who are more familiar with Mediawiki might have suggestions. -- Ryan 18:03, 3 February 2012 (EST)