So, I'm not 100% behind this competitive stuff. Stats, ranking, "biggest", "bestest"... I'd love to measure the progress of our project as a whole, not make a rat race between the different language versions.
I know that this game is played a lot on Wikipedia, but it kinda makes me queasy.
I wonder if we could make the same kind of competitiveness, but instead do it as some quality factors for the whole project. Like:
number of total articles across all language versions, counting interlanguage-linked articles once.
number of articles where an interlanguage link exists for all languages; that is, where the article is present in all language versions.
I'm not particularly behind having a tough competition between the Wikitravel versions either (as you said, Wikipedia does this a lot!), even though it's been proven that it does egg people on to write more articles. For example, I'm a regular contributor on Romanian Wikipedia, and at the start of December we had about 1,600 articles, just behind Hebrew Wikipedia. However, the presence of the Multilingual stats made the Wikipedia community a lot more aware, and article growth picked up to around 2,800, where we are now. The Hebrew Wikipedia also grew from this competition to around 2,600 articles.
So, even though it's not particularly good to have a "rat race", competition does encourage innovation and growth, resulting in a better and more comprehensive Wikitravel.
Competition aside, of course it would be great to measure things like total number of articles across all languages. In fact, I don't know how it would be done, but Wikipedia runs a new stats system which includes all sorts of statistics (updated weekly) on things such as number of images, number of interlanguage links, etc, and these can be measured across the whole Wikipedia project. Maybe something similar could be done for Wikitravel, but, as I said, I'm not really sure.
Anyway, the whole Multilingual statistics page I made now was really just a stub, since there aren't any multilingual Wikitravels now. And even though Romanian is set to launch very soon, and French pretty soon as well, you can't really have a page that only looks at the article count and stats of three versions of Wikitravel. So, the whole thing with the stats will probably only start properly once we get 5 or more Wikitravel versions, which I'm not really sure will happen in the near future - I mean, the French Wikitravel Expedition isn't really going anywhere (though I support it and think it's a great idea, it really hasn't moved forward for days).
And, concerning interlanguage links, will [[ro:Article]] work if you add it to a page? Of course, I'm not going to add any links to Romanian Wikitravel until the corresponding article there is finished, but I'm just wondering how the interlanguage process will work.
I'm also in a pretty good rapport with the Estonian Wikipedia - I will tell them about Wikitravel and hopefully we can also start an Estonian Wikitravel Expedition.
Cheers (and sorry for writing such a long message ;-) Ronline
Amazing how things change in less than one year, now isn't it! =) Now we have 5 language versions and I'd bet that before the end of the year we'll be up to 5000 articles total.
I'm thinking about writing a little bot to update this page automatically — is there any more elegant way to do this than just grabbing all 5 Special:Statistics and extracting the figures? Jpatokal 02:17, 13 Sep 2004 (EDT)
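A minimal sketch of the "grab and extract" approach described above. It assumes each language version can return its statistics as a `key=value;key=value` string (to my knowledge MediaWiki's Special:Statistics can do this with `action=raw`, but treat that as an assumption). The field name `good` for the article count and the sample figures are illustrative only, and the download step is left out.

```python
from typing import Dict


def parse_raw_stats(raw: str) -> Dict[str, int]:
    """Parse a 'key=value;key=value' statistics string into a dict of ints."""
    stats = {}
    for field in raw.strip().split(";"):
        if not field:
            continue  # tolerate a trailing semicolon
        key, _, value = field.partition("=")
        stats[key.strip()] = int(value)
    return stats


def total_articles(raw_by_lang: Dict[str, str], key: str = "good") -> int:
    """Sum the article count (assumed to be the 'good' field) over all languages."""
    return sum(parse_raw_stats(raw)[key] for raw in raw_by_lang.values())


# Hypothetical sample strings, not real Wikitravel figures.
SAMPLE = {
    "en": "total=9000;good=3000;edits=50000",
    "de": "total=1500;good=600;edits=4000",
}
print(total_articles(SAMPLE))
```

A bot would fetch one such string per language version and feed the dictionary to `total_articles`.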
Testing if date is now fixed... it is! Jpatokal 11:06, 13 Oct 2004 (EDT)
Race to 5000
8 weeks left to go until the end of the year and 839 new articles needed to reach 5000. This translates into an average requirement of 105 articles/week — for comparison, last week's growth was 111 but the previous week's was only 59. Time to run a country bot on the Swedish edition... =) Jpatokal 21:44, 4 Nov 2004 (EST)
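The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch that assumes constant weekly growth; the function names are made up for illustration.

```python
import math


def required_weekly_rate(articles_needed: int, weeks_left: int) -> int:
    """Average whole articles per week needed to close the gap in time."""
    return math.ceil(articles_needed / weeks_left)


def weeks_to_target(articles_needed: int, weekly_growth: int) -> int:
    """Whole weeks needed to close the gap at a constant weekly growth."""
    return math.ceil(articles_needed / weekly_growth)


print(required_weekly_rate(839, 8))  # 105
print(weeks_to_target(839, 111))     # 8, last week's pace is just enough
print(weeks_to_target(839, 59))      # 15, the previous week's pace is not
```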
Took one month longer than I was hoping, but the February 4th, 2005 count now says 5003. Yay! Up next, 10000? 8) Jpatokal 20:09, 3 Feb 2005 (EST)
So I just punched the tabular data into Excel and drew a couple of graphs. Some casual observations:
Wikitravel's growth in article count appears linear or close to it, while the growth in traffic and the number of edits seems exponential. So evidently a lot of effort is going into improving existing articles, which is great! And if each language version has linear growth X, the total growth becomes 6X (or rather 4X since ro/se aren't growing).
The comparative share of English has hovered amazingly steadily at 70%, although it seems that a turning point was reached around June 2005 — odds are that it's all downhill from here.
The blips for country stub robogeneration for de (in Nov 2004) and ja (in Aug 2005) are clearly visible.
Prediction time: if growth really is linear, which wikis usually aren't, it will take 26 weeks (est. Feb 2006) for the total number of articles to cross 10000, and 95 weeks (est. Aug 2007) for the English version to reach 10000. These are almost certainly too conservative.
10,000 articles was hit in December 2005 (2 months ahead of schedule)
Updated forecasts based on the last 6 months' data:
English language share drops below 50%: October 2006
10,000 articles in en: January 2007 (7 months faster than earlier predicted)
20,000 articles total: April 2007
10,000 articles in non-en: July 2007
I'd wager a fairly significant sum that we'll see 10,000 articles in en: before the year is out, and more probably than not the 20,000 total as well. Note that the language share figure follows a different curve from the rest, so that one I'm not so sure about — these aren't interpolated, just averaged. Jpatokal 11:50, 15 Jan 2006 (EST)
My hypothesis is that the article growth in Wikitravel is exponential, but the base is not the number of articles, but the number of usable articles. The reason is that a Wikitravel article has to cross a higher threshold than a Wikipedia article before it starts becoming useful. Even a one-sentence article giving the definition of a term is useful as a WP entry, while a one-sentence article about a destination is useless to someone planning to go there. So if the reason for expecting exponential growth is that the more people find an article useful, the more they will get involved, then the number of usable articles is a better proxy for usefulness than the number of articles. I don't know what practical conclusions one can draw if the hypothesis turns out to be right - maybe it means that there was some value in the conscious decision to keep information on one page. --Ravikiran 13:37, 15 Jan 2006 (EST)
Interesting. I think you're onto something here though -- the number of edits seems to be increasing rapidly, while the growth in the number of new articles created per day is comparatively slow. This also indicates that people are putting more effort into existing articles. Jpatokal 02:58, 16 Jan 2006 (EST)
I think you've got a good point. It will be interesting over the coming months to see how article status changes. --Evan 10:30, 16 Jan 2006 (EST)
So, I've tried to calculate some info on growth. Here's my stats (all for the main namespace in en:):
It seems like new articles, as a percent of edits per month, spikes in the first couple of months, hovers around 10% for the first year, then dips to about 5% in the second year. It will be interesting to see where this goes in 2006. --Evan 13:44, 16 Jan 2006 (EST)
A couple more data points: in July 03 we had 4 edits in the main namespace, 2 of which were new articles. Aug 03 was the CIA factbook import. I'm not sure about the other spikes, but my guess is that they correlate with media coverage. --Evan 14:01, 16 Jan 2006 (EST)
Yes, if quality of articles is more important than the number, edit count is more important than article count as a factor in spurring further growth. Also, I guess, the fact that the number of places in the world is limited also plays a role. It would be interesting to find out how we get new users and what makes them stick around. Is it time for a user survey? --Ravikiran 02:26, 17 Jan 2006 (EST)
Preparing for exponential growth
Friends, citizens, wikitravellers, I think Wikitravel is starting to hit exponential growth.  In three short months daily pageviews have almost tripled from under 50k to nearly 130k. The time to lay the groundwork for more expansion is now.
How is Wikitravel's hardware and network infrastructure set up to handle this — how many boxes and how big is the pipe? What is the financial basis behind Wikitravel — does Evan still pay the bills out of his pocket? For both questions, how does the current setup stand up to 1 million page views per day — which at current growth rates will be hit by the end of 2005?
I would like to propose, at a minimum, opening up a donation box. I would also not be averse to adding something like Google Adwords, but I'm sure opinions may differ on this: let's hear them out. Jpatokal 01:40, 3 Mar 2005 (EST)
Google offered to host Wikipedia. I would rather have Google host Wikitravel than have ads on Wikitravel. With the new maps.google.com they could be interested in travel. Then again, it seems to be working fine so far -- elgaard 05:37, 3 Mar 2005 (EST)
Our network service can handle a little more than 20 times our current bandwidth requirements (300Gb/mo compared to 14Gb/mo for Feb), and about 50 times our current disk space requirements. So even if we keep doubling in size every two months, we should be OK into Q4 05.
As for money: right now, donations would be more of a hassle than they're worth. Wikitravel costs Maj and me a not-insignificant amount, but dealing with the bureaucracy and taxes for personal gift donations right now is not worth it to us. I know, it seems rude to say that we don't want money, but we really don't. One of my goals for Wikitravel's upcoming 2-year birthday (!) is to have the legal infrastructure set up for further growth.
My big questions for growth are social. When we've had large waves of new users, we've had problems communicating our style for articles. How can we scale Wikitravel's social aspect to (say) 10,000 or 100,000 users? --Evan 07:29, 3 Mar 2005 (EST)
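The bandwidth estimate above (14Gb/mo used in February against a 300Gb/mo allowance, doubling every two months) can be reproduced with a short calculation. This is a sketch under the post's own assumption of steady doubling; the function name is made up.

```python
import math


def months_until_full(current: float, capacity: float, doubling_months: float) -> float:
    """Months until usage reaches capacity if it doubles every doubling_months."""
    return doubling_months * math.log2(capacity / current)


months = months_until_full(current=14, capacity=300, doubling_months=2)
print(round(months, 1))  # 8.8
```

Counting roughly 8.8 months from February lands in November 2005, which agrees with "OK into Q4 05".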
Well, it seems to me that the biggest problem is article format. Having thought about this problem, it seems to me that what is needed is a group of sort of sub-admins, really just regular users with specific responsibilities, who would be formally "in charge" of watching a group of articles (likely, one or two countries), and modifying edits by new users to conform to our desired style. The watchlist feature makes this a fairly easy job, and one that I think a lot of users already do in a less formal way. I know I keep an eye on most of the Ireland articles, and a few in Alberta. I figure this is fairly easy to organize just by having a page which lists all the countries of the world, and encouraging two or three people to put their names under each country. Simply by making it an expectation that these people know the MoS properly, and expecting them to do their job, I imagine it would get done. The signup is mainly to help ensure we cover the whole world.
I see a couple of problems with this:
To the best of my knowledge, you can't put pages which are linked, but have not yet been created, onto your watchlist. If we can't modify mediawiki to allow this, then the best solution I can think of is to make a list similar to the "Recent changes" page which only lists articles with the N tag. (Call it "recent additions", maybe). It would also be great if the same page could show where it was linked from, so it would be easier to identify what's in your territory.
Ideally there will be a lot of people involved, and in the internet generation many helpful people are completely incompetent spellers, or use their homonyms wrong (guilty), or, in my case, will intentionally ignore uses of proper (commonwealth) English. Thus, while I think this can solve the problem of formatting, I suspect many of the sub-admins would leave spelling mistakes. This seems like a somewhat more difficult problem to solve, as I think our current spelling gurus are already unable to vet every edit, and I suspect that the "army" of spelling gurus will grow much more slowly than the army of well-intentioned new users.
It's a thought though. Variations could include people specifically signing up to do spelling and grammar for a country, though I still think that this would result in a small number of people sifting through a lot of edits.
I don't think this will be an unsurmountable problem: as wikis grow, their quality generally also gets better, because they get more regulars who 'care' about their articles and keep them in good shape.
That said, I have thought along the same lines. Wikitravel's format is a lot stricter than Wikipedia's, and it's tough to enforce for all newcomers: practically every new article and new listing has to be reformatted to suit the template. I'll again suggest:
Have standard page templates automatically injected when a user creates a new page (not too hard to implement)
Use single-listing templates to enforce consistency in listings, pref. with a helper tool to create and edit them (a lot more work, but worth it in the long run?)
Don't know how difficult this is to implement but we need some kind of wizard. Ideally, when you create an article it should ask you what page it links from (to avoid orphans). Then it should ask what type it is: "Region, large city, small city ...." If none, it should ask "Can you sleep here?" If the answer is "No", then the article should not be created. Then ask for "Name of article" and auto-create the appropriate template with the first line filled in based on the name of the article, the type of article and what links to it, i.e. Name of article is a (type) e.g. region in [[(page that links)]] . This will catch the dead-ends. The whole approach will avoid the ambulance at the bottom of the cliff approach where you are always playing catch-up. --Nzpcmad 13:51, 6 Mar 2005 (EST)
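The wizard flow proposed above could look something like the following sketch. It is one reading of the post, not an existing MediaWiki feature: the type list is only the part the post spells out, and the fallback label `destination` is an invented placeholder.

```python
from typing import Optional

# Only the types named in the post; its list trails off with "....".
KNOWN_TYPES = ("region", "large city", "small city")


def first_line(name: str, linked_from: str, article_type: Optional[str],
               can_sleep_here: bool = False) -> Optional[str]:
    """Return the opening line for a new article, or None if it should not be created."""
    if not linked_from:
        return None  # no linking page given: the article would be an orphan
    if article_type not in KNOWN_TYPES:
        if not can_sleep_here:
            return None  # "Can you sleep here?" was answered "No"
        article_type = "destination"  # hypothetical fallback, not from the post
    return f"{name} is a {article_type} in [[{linked_from}]]."


print(first_line("Galway", "Ireland", "small city"))
```

Because the first line always links back to the parent page, an article created this way is never a dead end.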
I think injecting a template may allow creative writing to flow too freely. I agree that the blank page approach may be more intimidating for first-time writers who have never encountered a wiki before. You click on a red link and, hey presto! you're being asked to write about the place that you were expecting to read about. If this doesn't give you writer's block, probably nothing will. However, I think it may have hidden advantages. I suspect many people just hit the back button on the browser the first time, a few follow a help link and read what to do, and only the courageous (or foolish) start typing straight away. If we had some template or wizard in the article, people might think that all they had to do was click save to see the page, as it would be written for them or be copied from somewhere magical in cyberspace! We would then have to keep track of bad articles in some way as they wouldn't show up as an orphan or dead-end page. This is how I track down almost all the articles I copyedit. Dead-end pages identify pages written by users who do not know how a wiki works and so have not linked the page. Orphan pages identify pages where someone has been silly with their browser and either deleted a page or a link to the orphaned page. New pages show the most recent pages created, often by anonymous users, while short pages highlight those pages where there is some text, but not enough for an article to be a really good start. Newly created articles should show up on these special pages, so they can be given special attention by an experienced editor. Today I have copyedited quite a number of articles, adding templates etc. In most cases I can turn the submitted text into a reasonable stub. As a result there are just 4 dead-end pages that I cannot do anything with (and I think two of those are bugs).
I would rather have the wiki highlight new contributions from inexperienced users that show up as dead-ends and orphans than have first-time contributors assisted so much that their simple mistakes slip by unnoticed, only to rear their ugly heads a few years later as a systemic problem that developed through a lack of user understanding. - You mean people actually sit down and write this stuff, it's not just assembled from websearch results. - Huttite 07:21, 25 Mar 2005 (EST)
You've got this backwards. Why should us 'experienced editors' have to waste our time on hand-adding templates, and need tools for finding articles where this should be done? They should be in there by default. I find it tedious that I always have to pull up the Quick X template page and cut-n-paste the thing, I don't think it's at all reasonable for newbies to know where to find them, and it's obvious from most new pages that the average newbie is "courageous or foolish" and plunges forward with a stream of consciousness that is quite annoying to reformat into a template. Jpatokal 11:09, 25 Mar 2005 (EST)
Regarding spelling. My spelling's not great (I read loads! It just doesn't seem to go in, or I don't pay enough attention). How hard would it be to add a spell checker? Obviously there would be loads of words which were valid which it would pick up (proper nouns etc...) - but it might make spellchecking easier, and encourage the original editors to correct their own mistakes. (Sorry for all the spelling mistakes in this message :P ) -- Lionfish 0:35, 7 Mar 2005 (GMT)
If you have problems with spelling, use Firefox (if you are on Linux or Windows) and add the SpellBound extension (it does spell checking). It helps a lot. Aburda 17:41, 25 Jul 2005 (EDT)
Exponential growth? It looks to me like the number of visits is falling:
Average Daily Visits
This could be because either:
Now summer's come to the majority of wikitravel readers, people are spending more time in the sun, and less time hunched over their computers.
Contributors have run out of stuff to add that they know about (will we have to wait until after everyone's been on holiday for a new batch of contributions?)
The previous 'highs' were just short lived spikes due to links from /. etc.
People have come here, and have become disillusioned for some reason?
Or it could be because the Webalizer stats were messed up due to some issues on the ISP side. They should be fixed now but (at least) the May stats are missing big chunks.
All other ratings seem to show increased traffic: the number of edits and new articles keeps growing (see Wikitravel:Multilingual statistics) and even Alexa shows us growing constantly. Today's rank is 16,533, last year we didn't usually register in the top 100k. Jpatokal 08:04, 15 May 2005 (EDT)
Oh good! Was getting worried there. Thanks! -- Lionfish 15:21, 15 May 2005 (BST)
Alexa has us at 7705 now. Pashley 05:47, 21 June 2006 (EDT)
Trivial request, but would it be possible to exclude *.js, *.css, *.phtml, *.ico , *.txt from the Webalizer Top URLs listing, and maybe increase Top 30 to Top 50 or 100? It would be very interesting to see what the most popular content pages are. Jpatokal 03:34, 29 Oct 2004 (EDT)
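As an illustration of the filtering requested above, here is a sketch in Python that post-processes a list of requested URLs. It is not Webalizer configuration, and the sample URLs in the usage line are invented.

```python
from collections import Counter
from fnmatch import fnmatch

# File patterns named in the request above.
EXCLUDED = ("*.js", "*.css", "*.phtml", "*.ico", "*.txt")


def top_content_urls(hits, limit=50):
    """Count hits per URL, skip excluded patterns, and return the top `limit`."""
    counts = Counter(
        url for url in hits
        if not any(fnmatch(url, pattern) for pattern in EXCLUDED)
    )
    return counts.most_common(limit)


print(top_content_urls(["/en/Paris", "/skins/main.css", "/en/Paris", "/en/Tokyo"]))
```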
Just wondering whether there is some way for the community to be able to get an automatic update (at any time) on the number and names of actual "active contributors" to Wikitravel - maybe define this as anyone who has made at least one edit in the last month....? This would be an interesting and possibly useful gauge of interest and activity overall. Ideas? Pjamescowie 02:18, 10 May 2005 (EDT)
You'd have to do some fairly hardcore data crunching Wikipedia style for this — they generate their stats from an SQL dump each week. Jpatokal 13:12, 11 May 2005 (EDT)
Could we use their model to do the same for WikiTravel? Any SQL-savvy volunteers out there? Pjamescowie 05:38, 13 May 2005 (EDT)
I don't think we'd have to -- our database server isn't as crunched as Wikipedia's. Can you explain to me what the point of this is? I'm not sure I like the idea, since it seems to be aimed towards discriminating between "active" and "non-active" contributors. But it's not a hard query and I can drop it in as a Special page if it's really needed. --Evan 08:26, 13 May 2005 (EDT)
Purely as a gauge to see how we're developing as a community.... I don't see how any kind of discrimination could creep in, really... People are free to contribute as much or as little, as often or as infrequently as they wish.... So what? If you don't like publicising names, then don't! Maybe just provide the numbers.... Don't see what harm it would / could do - and it'd be interesting to see how our contributions are growing month by month....Pjamescowie 09:42, 15 May 2005 (EDT)
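The "active contributor" definition floated in this thread (at least one edit in the last month) is easy to state precisely. The sketch below works on a plain list of (user, timestamp) pairs with invented sample data; it is not the database query Evan mentions, and the 30-day window is an assumption standing in for "the last month".

```python
from datetime import datetime, timedelta


def active_contributors(edits, now, window_days=30):
    """Return the sorted names of users with at least one edit in the window."""
    cutoff = now - timedelta(days=window_days)
    return sorted({user for user, when in edits if cutoff <= when <= now})


# Hypothetical edit log.
edits = [
    ("Evan", datetime(2005, 5, 10)),
    ("Maj", datetime(2005, 3, 1)),
    ("Evan", datetime(2005, 5, 1)),
]
print(len(active_contributors(edits, now=datetime(2005, 5, 15))))
```

Publishing only the length of the list, as in the usage line, would give the numbers without the names.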
10,000 articles in all languages
Have we passed it already? --Ravikiran 11:08, 21 Dec 2005 (EST)
Yes, but the official count will only come out on Friday. Jpatokal 12:06, 21 Dec 2005 (EST)
I think that in the last couple of months, we have passed an inflexion point in terms of contributor involvement and quality of edits. — Ravikiran 08:18, 27 May 2006 (EDT)
I've been graphing it, which makes for a useful exercise. It turns out growth has been pretty much exponential -- so it looks like a straight line if you plot the number of articles logarithmically. However, a certain event caused a significant disruption in this trend. This past week's data looks like a recovery from the disruption may be beginning... but I'm not sure yet. Check out the number of new articles before and after the disruption. -- Colin 12:56, 27 May 2006 (EDT)
Any chance of you uploading the graphs? I was going to do it again next week, but no point in reinventing the wheel. Jpatokal 23:47, 27 May 2006 (EDT)
I was doing it with Excel, so the graphs suck big-time. -- Colin 02:48, 28 May 2006 (EDT)
So was/will I, so it can't possibly look much worse than this... Jpatokal 02:54, 28 May 2006 (EDT)
Ok. I did it at work, and it's a three-day weekend here so I can upload it either Monday night or Tuesday. -- Colin 13:34, 28 May 2006 (EDT)
Thanks for the graphs! You're right, you can definitely see the exponential curve forming, even when looking at English alone (which excludes the effect of all the new languages starting up). It's clear we're going to reach 10k in English way before next January -- I'd guess July-August -- and 20k total will probably be reached by the end of the year. German growth also continues to be rocketlike, while Japanese is creeping up towards the 1000 mark and even passing French. But English is growing so fast that its 60% share isn't budging anywhere... Jpatokal 00:17, 31 May 2006 (EDT)
We covered more than half the remaining ground to 10K this week. But the ache in my wrist says I won't be making quite as many new articles this week. -- Colin 20:35, 29 June 2006 (EDT)
Moved to Shared
So I've now reengineered StatScript to upload new data here instead, and while I was at it, I made the listing more compact and added automatic deltas to the previous week's figures. This will make it a little easier to see who's growing and who's stagnating.
If somebody out there has a lot of time to spare, feel free to manually add deltas for the existing data as well... Jpatokal 23:19, 20 August 2006 (EDT)
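For anyone adding deltas to the older data, the calculation is just the difference from the previous week's figure for each language. The sketch below is not StatScript's code (which is not shown on this page); the output format and the sample numbers are assumptions.

```python
def with_deltas(previous, current):
    """Map each language to (count, change since the previous week)."""
    return {
        lang: (count, count - previous.get(lang, 0))
        for lang, count in current.items()
    }


def format_row(lang, count, delta):
    """Render one line such as 'en: 110 (+10)'."""
    return f"{lang}: {count} ({delta:+d})"


# Hypothetical figures for two consecutive weeks.
last_week = {"en": 100, "de": 50}
this_week = {"en": 110, "de": 50, "it": 5}
for lang, (count, delta) in with_deltas(last_week, this_week).items():
    print(format_row(lang, count, delta))
```

A language version that did not exist the previous week is treated as starting from zero.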
Good move. And I loved this deltas thing too. Thanks! --Ricardo (Rmx) 10:10, 21 August 2006 (EDT)
The server that runs StatScript's cronjobs is currently pining for the fjords, so the stats will be delayed until it's back up... Jpatokal 04:24, 8 September 2006 (EDT)
Evan and Maj have given StatScript a new cardboard box to live in, so expect new statistical goodness again this Friday. Jpatokal 11:35, 20 September 2006 (EDT)
Thanks again to Colin for the new graphs! That log graph is downright scary, because it means the number of articles doubles every year and this is going to keep on happening. The pattern is so exact it's spooky: for en, article count at the end of year has been ~1000 > 3118 > 6498 > ~13000.
The English share seems likely to finally dip below 50% at about the same time we hit 25,000 articles total. Saying this will happen by the end of the year is pushing it, but it's not very far away... Jpatokal 00:58, 17 October 2006 (EDT)
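The "straight line on a log graph" observation can be turned into a doubling time with a least-squares fit of the logarithm of the article count. The sketch below uses the four year-end figures for en quoted above, of which the first and last are only approximate.

```python
import math


def doubling_time(counts):
    """Fit ln(count) against the index by least squares; return periods per doubling."""
    n = len(counts)
    xs = range(n)
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return math.log(2) / slope


year_end_en = [1000, 3118, 6498, 13000]  # from the post; ends are approximate
print(round(doubling_time(year_end_en), 2))
```

With these figures the fit gives a doubling time of roughly 0.8 years, a little faster than one year because the first step (~1000 to 3118) was closer to a tripling. The last two steps are each close to a clean doubling.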
I'm a little concerned by the linearity in English growth of late, but maybe it's just seasonal. It seems to me the Total should have taken a hit due to English's slow growth, except that the deficit has been masked by the number of new language versions coming out of the gate.
And wtf is up with Italian wikitravel? That growth rate is kinda high.
And the last thing to note is that it seems like growth is erratic until 1300-1500 articles, after which growth generally smooths out. I wonder if this is a critical level at which the project no longer seems like a skeleton, but seems like something wonderful instead. -- Cjensen 01:06, 17 October 2006 (EDT)
Italian is booming because Gobbler is creating hundreds and hundreds of outline articles, so it's not really sustainable, but it's a good base for further growth.
I'm not sure we can make grand pronouncements about versions above 1500 articles, because only two have hit the mark so far (en and de). New language versions often have huge leaps because of robo-generated articles. But I think the experience of de, fr, ja shows that, once you have around 1000 articles, continued slow growth is guaranteed even if some key contributors leave, whereas ro has been stuck at an insufficient <500 for ages now. Jpatokal 02:29, 17 October 2006 (EDT)
Well, I've said this before, but the number of articles won't grow exponentially for long for any language version, simply because the number of places on earth is limited. But the quality of the articles is likely to grow exponentially. — Ravikiran r 03:24, 17 October 2006 (EDT)
There may be a limit, but we're a long way from hitting it. For the US alone, there are 195,000 FIPS place codes. If we assume that the world has 6.7 billion people and the US has 300 million of them, we can extrapolate that there are about 4.3 million distinct "places" in the world. Even if that's off by an order of magnitude (which is entirely possible), we've still got another 6 years of exponential growth to go... Jpatokal 04:11, 17 October 2006 (EDT)
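The extrapolation above scales the US place count by the ratio of world population to US population. A quick check of the arithmetic, using only the figures from the post (the function name is made up):

```python
def extrapolate_places(us_places, us_population, world_population):
    """Scale the US place count by the world-to-US population ratio."""
    return us_places * world_population / us_population


world_places = extrapolate_places(195_000, 300e6, 6.7e9)
print(round(world_places / 1e6, 2))  # 4.36 (million)
print(round(world_places / 10))      # the "off by an order of magnitude" case
```

The result is about 4.36 million, which the post rounds down to 4.3 million; the pessimistic case is therefore around 435,000 places.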
Question: is Wikitravel Shared undergoing comparable growth? It doesn't look like it is yet, but here's a place where massive growth would really be helpful to the other wikis. Articles with satisfactory photographs are still in a distinct minority. I'd be very interested in charting trends here, if one of you guys has the tool for the job. -- Bill-on-the-Hill 09:56, 17 October 2006 (EDT)
Actually, I think it is — there have been over 3000 files uploaded since Shared was created less than half a year ago, which I think is very respectable. The uploaded file stats are available in the same raw output that I use to generate the main article stats, so I'll hack StatScript at some point to log those too (probably on a separate page). Jpatokal 13:44, 17 October 2006 (EDT)
New version graveyard
Interesting pattern of the day: the three Wikitravel versions that usually show "0" as the number of articles added have 487, 465 and 470 articles. This seems to be a difficult inflection point between the enthusiasm of youth (where people add entries for the sheer joy of doing it) and the wisdom of adulthood (where the guide is actually useful)... Jpatokal 01:04, 22 December 2006 (EST)
Since Jpatokal is now claiming to be psychic for successfully predicting when we would reach 25K articles almost an entire month ahead of time (heh), I think for sheer entertainment value we should all now make predictions about where we'll be at the end of 2007. Please add/edit your prediction below before the end of January. Working hard to add new useful articles in order to make your prediction come true is encouraged.
Total 44K, en 19K. -- Cjensen 14:54, 5 January 2007 (EST)
Total 50K, en 20K. Mostly because I like round numbers. (Having the source code to a stub autogenerator also helps.) Jpatokal 01:33, 6 January 2007 (EST)
It took a bit longer than I expected, but on 30 Mar 2007 the English language share dipped to 49.75%, meaning over half the articles on Wikitravel are now in other languages. Jpatokal 11:48, 30 March 2007 (EDT)
That's good. Hey Jani, could you help out on the hi: wikitravel by programming a bot to auto-generate country articles? It would really help. You're very good with programming and stuff. Upamanyu 08:06, 18 April 2007 (EDT)