User:Tatatabot/Scripts/pywikipedia/modification
From Wikitravel Shared
The information on this page is outdated. See setup for Wikitravel instead.
To comply with Wikitravel's script policy, you need to modify several scripts of the Python Wikipedia Robot Framework (pywikipedia).
- Modifications below are based on pywikipedia revision 6949.
- A patch file for pywikipedia revision 7292 is here.
user-config.py
When you set up your bot for the first time, you need to create a file "user-config.py" in the "pywikipedia" directory.
mylang = 'en'
family = 'wikitravel'
usernames['wikitravel']['en'] = u'USERNAME OF YOUR BOT'
put_throttle = 60
If you get bot flags on more than one language version of Wikitravel, you can add further usernames lines of the same form, one per language.
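As a rough illustration of the extra usernames lines, the sketch below assumes bot flags on the Japanese version and on Wikitravel Shared (code 'wts'); both codes are only examples. In a real "user-config.py" the usernames dictionary already exists when the file is read, so the stand-in defined at the top is an assumption that merely lets the sketch run by itself and should not be copied.

```python
# Stand-in for the dictionary that already exists inside user-config.py.
# Assumption for this sketch only; do not copy these two lines.
from collections import defaultdict
usernames = defaultdict(dict)

mylang = 'en'
family = 'wikitravel'
usernames['wikitravel']['en'] = u'USERNAME OF YOUR BOT'
# One extra line per language version where the bot has a flag (example codes).
usernames['wikitravel']['ja'] = u'USERNAME OF YOUR BOT'
usernames['wikitravel']['wts'] = u'USERNAME OF YOUR BOT'
put_throttle = 60
```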
family.py
Comment out the 'meatball' line where the code sets "self.known_families" within "__init__" of "class Family".
'mbtest': 'mbtest',
# 'meatball': 'meatball',
'mediazilla': 'mediazilla',
Comment out the 'world66' line where the code sets "self.known_families" within "__init__" of "class Family".
'wookieepedia': 'wookieepedia',
# 'world66': 'world66',
'wowwiki': 'wowwiki',
Add the lines below next to "languages_by_size" within "__init__" of "class Family".
# for Wikitravel's /Run subpages check.
self.wt_script_policy = []
wikipedia.py
_getEditPage
To work around a peculiar error caused by Wikitravel's MediaWiki configuration, add the condition below.
# for Wikitravel's /Run subpages check.
## if not matchVersionTab:
if not matchVersionTab and not self.site().family.name == 'wikitravel':
botMayEdit
Add the code below to the top of "botMayEdit".
# for Wikitravel's /Run subpages check.
wt_family_name = self.site().family.name
if wt_family_name == 'wikitravel':
    wt_lang = self.site().lang
    # Title of the bot's own /Run page: <User namespace>:<bot name>/Run
    wt_username = config.usernames[wt_family_name][wt_lang].encode('utf-8')
    wt_username = urllib.quote(wt_username)
    wt_namespace = self.site().family.namespaces[2][wt_lang].encode('utf-8')
    wt_namespace = urllib.quote(wt_namespace)
    wt_user_run = unicode(wt_namespace) + u':' + unicode(wt_username) + u'/Run'
    wt_page = Page(self.site(), wt_user_run)
    wt_user_run_text = wt_page.get(get_redirect = True)
    # Title of the site-wide /Run page: <project namespace>:<script policy>/Run
    wt_namespace = self.site().family.namespaces[4][wt_lang].encode('utf-8')
    wt_namespace = urllib.quote(wt_namespace)
    wt_script_policy = self.site().family.wt_script_policy[wt_lang].encode('utf-8')
    wt_script_policy = urllib.quote(wt_script_policy)
    wt_system_run = unicode(wt_namespace) + u':' + unicode(wt_script_policy) + u'/Run'
    wt_page = Page(self.site(), wt_system_run)
    wt_system_run_text = wt_page.get(get_redirect = True)
    # Stop unless both /Run pages contain 'yes'.
    if 'yes' not in wt_user_run_text or 'yes' not in wt_system_run_text:
        raise Error('Bot stopped by /Run page on %s -- %s = %s , %s = %s'
                    % (wt_lang, wt_user_run, wt_user_run_text, wt_system_run, wt_system_run_text))
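The /Run check above can be tried outside pywikipedia with a minimal sketch such as the following. It is standalone Python 3 (so it uses urllib.parse.quote rather than Python 2's urllib.quote), and the function names, the namespace names and the bot name are assumptions made up for illustration. It only builds the two page titles and applies the 'yes' test; fetching the pages is left out.

```python
from urllib.parse import quote  # Python 3; the patch itself uses Python 2's urllib.quote


def build_run_titles(user_ns, project_ns, username, policy_title):
    """Return the titles of the bot's own /Run page and the site-wide one."""
    user_run = u'%s:%s/Run' % (quote(user_ns), quote(username))
    system_run = u'%s:%s/Run' % (quote(project_ns), quote(policy_title))
    return user_run, system_run


def may_run(user_run_text, system_run_text):
    """The bot may edit only if both /Run pages contain 'yes'."""
    return 'yes' in user_run_text and 'yes' in system_run_text


# Sample values (assumptions): namespace names and bot name are illustrative.
titles = build_run_titles(u'User', u'Wikitravel', u'ExampleBot', u'Script policy')
print(titles)
print(may_run(u'yes', u'no'))
```

Because the test is a plain substring match, any page text containing 'yes' counts as permission, which mirrors the condition in the patch.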
replaceLanguageLinks
To follow Wikitravel's manual of style, we need to add the four lines starting "# for Wikitravel's language links position."
# Is there any text in the 'after' part that means we should keep it after?
if "</noinclude>" in s2[firstafter:]:
    if separatorstripped:
        s = separator + s
    newtext = s2[:firstafter].replace(marker,'') + s + s2[firstafter:]
elif site.language() in site.family.categories_last:
    cats = getCategoryLinks(s2, site = site)
    s2 = removeCategoryLinksAndSeparator(s2.replace(marker,'',cseparatorstripped).strip(), site) + separator + s
    newtext = replaceCategoryLinks(s2, cats, site=site, addOnly=True)
# for Wikitravel's language links position.
elif site.family.name == 'wikitravel':
    s = separator + s + separator
    newtext = s2[:firstafter].replace(marker,'') + s + s2[firstafter:]
else:
    newtext = s2.replace(marker,'').strip() + separator + s
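To see what the Wikitravel branch changes, the two string expressions can be compared side by side in a small standalone sketch. The function names, the sample page text, the '@@' marker and the way firstafter is computed here are all assumptions for illustration; only the two expressions themselves are taken from the code above.

```python
def place_links_wikitravel(s2, firstafter, marker, s, separator):
    """Wikitravel branch: put the links at the marker, with a separator on both sides."""
    s = separator + s + separator
    return s2[:firstafter].replace(marker, '') + s + s2[firstafter:]


def place_links_default(s2, marker, s, separator):
    """Default branch: append the links to the end of the page."""
    return s2.replace(marker, '').strip() + separator + s


# Hypothetical sample page; '@@' stands in for the marker.
marker = '@@'
s2 = 'Body\n@@\n[[Category:X]]'
firstafter = s2.find(marker) + len(marker)  # assumption: position just after the marker
links = '[[ja:Foo]]'

wt = place_links_wikitravel(s2, firstafter, marker, links, '\n')
df = place_links_default(s2, marker, links, '\n')
print(repr(wt))
print(repr(df))
```

In this sample the Wikitravel branch keeps the language links where the marker was, ahead of the trailing text, while the default branch moves them to the very end.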
_getUserData
Wikitravel's MediaWiki does not output the user page link at the top of pages; instead, "username.js" adds it when a page loads. As a result, the username cannot be retrieved from the element with id="pt-userpage", so we need to add the four lines starting "# for Wikitravel's user page link." below to set the username from the config. If you do not, you will be asked for your password on every edit, for each page on each language version.
m = userpageR.search(text)
if m:
    self._isLoggedIn[index] = True
    self._userName[index] = m.group('username')
else:
    self._isLoggedIn[index] = False
    # No idea what is the user name, and it isn't important
    self._userName[index] = None
# for Wikitravel's user page link.
if self.family.name == 'wikitravel':
    self._isLoggedIn[index] = True
    self._userName[index] = config.usernames[self.family.name][self.lang].encode('utf-8')
# Check user groups, if possible (introduced in 1.10)
groupsR = re.compile(r'var wgUserGroups = \[\"(.+)\"\];')
m = groupsR.search(text)
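The effect of the override can be sketched in isolation as below. The function name, its parameters and the sample configuration are hypothetical, and the sketch assumes the Wikitravel check runs after the normal if/else rather than inside it.

```python
def resolve_login(scraped_username, family_name, lang, usernames):
    """Return (is_logged_in, username) in the way the patched code is read here."""
    if scraped_username is not None:
        logged_in, name = True, scraped_username
    else:
        logged_in, name = False, None
    # Wikitravel override: trust the configured name instead of the page HTML.
    if family_name == 'wikitravel':
        logged_in, name = True, usernames[family_name][lang]
    return logged_in, name


usernames = {'wikitravel': {'en': u'ExampleBot'}}  # hypothetical configuration
print(resolve_login(None, 'wikitravel', 'en', usernames))
print(resolve_login(None, 'wikipedia', 'en', usernames))
```

Note that the override marks the bot as logged in without verifying it, so a genuinely expired session would not be detected at this point.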
mediawiki_message
- On the Esperanto version, one system message used in the script is missing from the PHP output of eo:special:Allmessages.
- On the Hindi version, two system messages used in the script are missing from the PHP output of hi:special:Allmessages.
So we need to add the nine lines starting "# for Wikitravel" below to set those system messages before "value" is retrieved by "key".
# for Wikitravel
if self.family.name == 'wikitravel':
    # (eo:)
    if self.lang == 'eo':
        self._mediawiki_messages['readonly_lag'] = u'The database has been automatically locked while the slave database servers catch up to the master'
    # (hi:)
    if self.lang == 'hi':
        self._mediawiki_messages['readonly'] = u'Database locked'
        self._mediawiki_messages['readonly_lag'] = u'The database has been automatically locked while the slave database servers catch up to the master'
key = key.lower()
try:
    value = self._mediawiki_messages[key]
    return value
except KeyError:
    raise KeyError("MediaWiki key '%s' does not exist on %s"
                   % (key, self))
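The same idea, seeding the known gaps before the lookup, can be written as a standalone sketch. The FALLBACKS table and the function signature are assumptions for illustration; the message texts are the ones from the patch.

```python
READONLY_LAG = (u'The database has been automatically locked while the '
                u'slave database servers catch up to the master')

# Fallback messages per language, taken from the patch above.
FALLBACKS = {
    'eo': {'readonly_lag': READONLY_LAG},
    'hi': {'readonly': u'Database locked', 'readonly_lag': READONLY_LAG},
}


def mediawiki_message(messages, family_name, lang, key):
    """Look up a system message, seeding known gaps for Wikitravel first."""
    if family_name == 'wikitravel':
        messages.update(FALLBACKS.get(lang, {}))
    key = key.lower()
    try:
        return messages[key]
    except KeyError:
        raise KeyError("MediaWiki key '%s' does not exist on %s:%s"
                       % (key, family_name, lang))


print(mediawiki_message({}, 'wikitravel', 'hi', 'Readonly'))
```

Keeping the fallbacks in one table means a further language with missing messages would need only one more entry.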
wikitravel_family.py
langs and namespaces
You may need to add langs and namespaces when you get bot flags on other language versions. In particular, namespaces[2] and namespaces[4] are mandatory for the /Run subpages check. You can retrieve namespace information from the MediaWiki API. The URLs below are examples.
- en: http://wikitravel.org/wiki/en/api.php?action=query&meta=siteinfo&siprop=general|namespaces|statistics
- ja: http://wikitravel.org/wiki/ja/api.php?action=query&meta=siteinfo&siprop=general|namespaces|statistics
- shared: http://wikitravel.org/wiki/shared/api.php?action=query&meta=siteinfo&siprop=general|namespaces|statistics
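A minimal sketch of how the example URLs are formed, and of how the two mandatory namespace names could be picked out of a reply, is given below. It assumes the reply is requested as JSON (by appending the API's format=json parameter to the URL) and that each namespace name sits under the "*" key; the sample reply is made up and no network access is performed.

```python
import json

API_TEMPLATE = ('http://wikitravel.org/wiki/%s/api.php'
                '?action=query&meta=siteinfo'
                '&siprop=general|namespaces|statistics')


def siteinfo_url(path_code):
    """Build the siteinfo URL; path_code is 'en', 'ja', 'shared', ..."""
    return API_TEMPLATE % path_code


def namespace_names(siteinfo_json, wanted=(2, 4)):
    """Pick the names of the wanted namespaces out of a JSON siteinfo reply."""
    namespaces = json.loads(siteinfo_json)['query']['namespaces']
    return dict((ns, namespaces[str(ns)]['*']) for ns in wanted)


# Made-up reply for illustration; a real one contains many more fields.
sample = ('{"query": {"namespaces": {'
          '"2": {"id": 2, "*": "User"}, '
          '"4": {"id": 4, "*": "Wikitravel"}}}}')
print(siteinfo_url('ja'))
print(namespace_names(sample))
```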
Alternatively, you can copy my "wikitravel_family.py".
Other modifications to wikitravel_family.py
Add the line below to the top of "wikitravel_family.py".
# -*- coding: utf-8 -*-
Add all language codes and "wts" to "langs".
self.langs = {
    'ar':'ar',
    'ca':'ca',
    'de':'de',
    'en':'en',
    'eo':'eo',
    'es':'es',
    'fi':'fi',
    'fr':'fr',
    'he':'he',
    'hi':'hi',
    'hu':'hu',
    'it':'it',
    'ja':'ja',
    'nl':'nl',
    'pl':'pl',
    'pt':'pt',
    'ro':'ro',
    'ru':'ru',
    'sv':'sv',
    'zh':'zh',
    'wts':'wts',
}
Add the lines below next to "languages_by_size" within "__init__" of "class Family". If you get bot flags on other language versions of Wikitravel listed above, add a line for each of them in the same form.
# for Wikitravel's /Run subpages check.
self.wt_script_policy = {
    '_default': u'Script policy',
    'en': u'Script policy',
    'ja': u'スクリプトの基本方針',
}
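The "botMayEdit" patch indexes this dictionary directly by language code, so a language without its own entry raises KeyError. A hypothetical helper that falls back to the '_default' entry might look like the sketch below; whether the English title is the right fallback on a given language version is an assumption that has to be checked on that wiki.

```python
# -*- coding: utf-8 -*-
wt_script_policy = {
    '_default': u'Script policy',
    'en': u'Script policy',
    'ja': u'スクリプトの基本方針',
}


def policy_title(lang, policies=wt_script_policy):
    """Hypothetical helper: use the language's title, else the '_default' one."""
    return policies.get(lang, policies['_default'])


print(policy_title('ja'))
print(policy_title('de'))
```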
Add the lines below next to "wt_script_policy", which you added above, within "__init__" of "class Family".
# Interwiki sorting order for Wikitravel Shared (wts:)
self.alphabetic = [
    'ar', 'ca', 'de', 'en', 'eo', 'es', 'fi', 'fr', 'he', 'hi',
    'hu', 'it', 'ja', 'nl', 'pl', 'pt', 'ro', 'ru', 'sv', 'zh',
    'wts'
]
# for Wikitravel Shared (wts:)
# Which languages have a special order for putting interlanguage links,
# and what order is it? If a language is not in interwiki_putfirst,
# alphabetical order on language code is used. For languages that are in
# interwiki_putfirst, interwiki_putfirst is checked first, and
# languages are put in the order given there. All other languages are put
# after those, in code-alphabetical order.
self.interwiki_putfirst = {
    'ar': self.alphabetic,
    'ca': self.alphabetic,
    'de': self.alphabetic,
    'en': self.alphabetic,
    'eo': self.alphabetic,
    'es': self.alphabetic,
    'fi': self.alphabetic,
    'fr': self.alphabetic,
    'he': self.alphabetic,
    'hi': self.alphabetic,
    'hu': self.alphabetic,
    'it': self.alphabetic,
    'ja': self.alphabetic,
    'nl': self.alphabetic,
    'pl': self.alphabetic,
    'pt': self.alphabetic,
    'ro': self.alphabetic,
    'ru': self.alphabetic,
    'sv': self.alphabetic,
    'zh': self.alphabetic,
}
# for Wikitravel Shared (wts:), Previous DOTM, etc.
# Allows crossnamespace interwiki linking.
# Lists the possible crossnamespaces combinations
# keys are originating NS
# values are dicts where:
#   keys are the originating langcode, or _default
#   values are dicts where:
#     keys are the languages that can be linked to from the lang+ns, or _default
#     values are a list of namespace numbers
self.crossnamespace[0] = {
    '_default': {
        '_default': [4],
        'wts': [4, 14],
    },
}
self.crossnamespace[4] = {
    '_default': {
        '_default': [0],
    },
}
self.crossnamespace[14] = {
    'wts': {
        '_default': [0],
    },
}
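The nested "crossnamespace" structure is easier to follow with a lookup written out. The sketch below is not pywikipedia's own lookup code; it is a hypothetical helper that follows the comments above, trying the exact key first and then '_default' at each level.

```python
crossnamespace = {
    0: {'_default': {'_default': [4], 'wts': [4, 14]}},
    4: {'_default': {'_default': [0]}},
    14: {'wts': {'_default': [0]}},
}


def allowed_namespaces(table, ns, from_lang, to_lang):
    """Hypothetical lookup: exact key first, then '_default', else nothing."""
    by_source = table.get(ns, {})
    by_target = by_source.get(from_lang, by_source.get('_default', {}))
    return by_target.get(to_lang, by_target.get('_default', []))


print(allowed_namespaces(crossnamespace, 0, 'en', 'wts'))
print(allowed_namespaces(crossnamespace, 14, 'en', 'ja'))
```

Under this reading, an article (namespace 0) on any language version may link to namespaces 4 and 14 on Wikitravel Shared, while categories (namespace 14) may link across namespaces only when the originating site is 'wts'.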
Add an "if" statement to "scriptpath" to support Wikitravel Shared.
def scriptpath(self, code):
    # for Wikitravel Shared (wts:)
    ##return '/wiki/%s' % code
    if code == 'wts':
        return '/wiki/shared'
    else:
        return '/wiki/%s' % code
Update the version number of Wikitravel's MediaWiki at the bottom of "wikitravel_family.py".
def version(self, code):
    # return "1.10.1"
    return "1.11.2"

