User:Tatatabot/Scripts/pywikipedia/modification

From Wikitravel Shared

The information on this page is outdated. See setup for Wikitravel instead.


To comply with Wikitravel's script policy, you need to modify several scripts of the Python Wikipedia Robot Framework (pywikipedia).

  • The modifications below are based on pywikipedia revision 6949.
  • A patch file for pywikipedia revision 7292 is here.


user-config.py

When you set up your bot for the first time, you need to create a file named "user-config.py" in the "pywikipedia" directory.


mylang = 'en'
family = 'wikitravel'
usernames['wikitravel']['en'] = u'USERNAME OF YOUR BOT'
put_throttle = 60

If your bot has bot flags on more than one language version of Wikitravel, add one more usernames line per language version, following the pattern of the usernames line above.
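As a rough illustration, a "user-config.py" for a bot flagged on two language versions might look like the sketch below. The bot name ExampleBot and the choice of ja: are placeholders. The defaultdict line only stands in for the "usernames" object that the framework is assumed to provide before it reads this file; a real "user-config.py" would not contain it.

```python
from collections import defaultdict

# Stand-in for the framework's own "usernames" object (assumption);
# leave this line out of a real user-config.py.
usernames = defaultdict(dict)

mylang = 'en'
family = 'wikitravel'
# One line per language version on which the bot has a bot flag.
usernames['wikitravel']['en'] = u'ExampleBot'  # placeholder name
usernames['wikitravel']['ja'] = u'ExampleBot'  # placeholder name
put_throttle = 60
```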

family.py

Comment out the meatball line in the code that sets "self.known_families" within "__init__" of "class Family".


            'mbtest':            'mbtest',
#            'meatball':          'meatball',
            'mediazilla':        'mediazilla',

Comment out the world66 line in the code that sets "self.known_families" within "__init__" of "class Family".


            'wookieepedia':     'wookieepedia',
#            'world66':          'world66',
            'wowwiki':          'wowwiki',

Add the lines below next to "languages_by_size" within "__init__" of "class Family".


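        # for Wikitravel's /Run subpages check.
        # Empty dict by default; wikitravel_family.py overrides it with a
        # mapping from language code to the script policy page title.
        self.wt_script_policy = {}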

wikipedia.py

_getEditPage

To work around an error peculiar to Wikitravel's MediaWiki settings, extend the condition as shown below so that the check is skipped for the wikitravel family.


        # for Wikitravel's /Run subpages check.
        ## if not matchVersionTab:
        if not matchVersionTab and not self.site().family.name == 'wikitravel':

botMayEdit

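Add the code below to the top of "botMayEdit". It stops the bot unless both the bot's own /Run subpage and the script policy's /Run subpage contain "yes".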


        # for Wikitravel's /Run subpages check.
        wt_family_name = self.site().family.name
        if wt_family_name == 'wikitravel':
            wt_lang = self.site().lang
            wt_username = config.usernames[wt_family_name][wt_lang].encode('utf-8')
            wt_username = urllib.quote(wt_username)
            wt_namespace = self.site().family.namespaces[2][wt_lang].encode('utf-8')
            wt_namespace = urllib.quote(wt_namespace)
            wt_user_run = unicode(wt_namespace) + u':' + unicode(wt_username) + u'/Run'
            wt_page = Page(self.site(), wt_user_run)
            wt_user_run_text = wt_page.get(get_redirect = True)
            wt_namespace = self.site().family.namespaces[4][wt_lang].encode('utf-8')
            wt_namespace = urllib.quote(wt_namespace)
            wt_script_policy = self.site().family.wt_script_policy[wt_lang].encode('utf-8')
            wt_script_policy = urllib.quote(wt_script_policy)
            wt_system_run = unicode(wt_namespace) + u':' + unicode(wt_script_policy) + u'/Run'
            wt_page = Page(self.site(), wt_system_run)
            wt_system_run_text = wt_page.get(get_redirect = True)
            if 'yes' not in wt_user_run_text or 'yes' not in wt_system_run_text:
                raise Error('Bot stopped by /Run page on %s -- %s = %s , %s = %s'
                         % (wt_lang, wt_user_run, wt_user_run_text, wt_system_run, wt_system_run_text))
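The decision made by the code above can be sketched on its own, without the framework. This is only an illustration: the function names are made up for this example, the namespace names "User" and "Wikitravel" are assumed values for en:, and ExampleBot is a placeholder. Fetching the pages is left out.

```python
def run_page_titles(user_namespace, project_namespace, bot_name, policy_title):
    """Build the titles of the two /Run pages that are consulted."""
    user_run = u'%s:%s/Run' % (user_namespace, bot_name)
    system_run = u'%s:%s/Run' % (project_namespace, policy_title)
    return user_run, system_run


def bot_may_run(user_run_text, system_run_text):
    """Return True only if both /Run pages contain the word 'yes'."""
    return 'yes' in user_run_text and 'yes' in system_run_text


# Assumed en: namespace names and a placeholder bot name.
titles = run_page_titles(u'User', u'Wikitravel', u'ExampleBot', u'Script policy')
allowed = bot_may_run(u'yes', u'yes')
blocked = bot_may_run(u'yes', u'no')
```

Either page alone is enough to stop the bot, which is why the original condition joins the two "not in" tests with "or".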

replaceLanguageLinks

To follow Wikitravel's manual of style, we need to add the four lines starting with "# for Wikitravel's language links position." below.


            # Is there any text in the 'after' part that means we should keep it after?
            if "</noinclude>" in s2[firstafter:]:
                if separatorstripped:
                    s = separator + s
                newtext = s2[:firstafter].replace(marker,'') + s + s2[firstafter:]
            elif site.language() in site.family.categories_last:
                cats = getCategoryLinks(s2, site = site)
                s2 = removeCategoryLinksAndSeparator(s2.replace(marker,'',cseparatorstripped).strip(), site) + separator + s
                newtext = replaceCategoryLinks(s2, cats, site=site, addOnly=True)
            # for Wikitravel's language links position.
            elif site.family.name == 'wikitravel':
                s = separator + s + separator
                newtext = s2[:firstafter].replace(marker,'') + s + s2[firstafter:]
            else:
                newtext = s2.replace(marker,'').strip() + separator + s
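The difference between the Wikitravel branch and the default branch can be shown with a small standalone sketch. The function name, the "in_place" switch and the way "first_after" is computed are simplifications made for this example and are not taken from the framework.

```python
def place_links(text, marker, links, separator, in_place):
    """Put interlanguage links back into text.

    text contains marker at the spot where the old links were removed.
    """
    pos = text.find(marker)
    if pos < 0:
        first_after = len(text)
    else:
        first_after = pos + len(marker)
    if in_place:
        # Wikitravel style: keep the links where they were,
        # with a separator on both sides.
        block = separator + links + separator
        return text[:first_after].replace(marker, '') + block + text[first_after:]
    # Default style: move the links to the end of the page.
    return text.replace(marker, '').strip() + separator + links


page = 'Intro\n@@\n[[Category:X]]'
kept = place_links(page, '@@', '[[de:Foo]]', '\n', True)
moved = place_links(page, '@@', '[[de:Foo]]', '\n', False)
```

In "kept" the links stay between the introduction and the category, while in "moved" they end up after the category.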

_getUserData

Wikitravel's MediaWiki does not output the user page link at the top of pages; instead, "username.js" adds it when a page is loaded. Because of this, we cannot retrieve the username from the element with id="pt-userpage", so we need to add the four lines starting with "# for Wikitravel's user page link." below to set the username from the config. If you do not, you will be asked for your password on every edit, for each page on each language version.


        m = userpageR.search(text)
        if m:
            self._isLoggedIn[index] = True
            self._userName[index] = m.group('username')
        else:
            self._isLoggedIn[index] = False
            # No idea what is the user name, and it isn't important
            self._userName[index] = None

        # for Wikitravel's user page link.
        if self.family.name == 'wikitravel':
            self._isLoggedIn[index] = True
            self._userName[index] = config.usernames[self.family.name][self.lang].encode('utf-8')

        # Check user groups, if possible (introduced in 1.10)
        groupsR = re.compile(r'var wgUserGroups = \[\"(.+)\"\];')
        m = groupsR.search(text)
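The effect of the added lines can be tried out in isolation with the sketch below. The regular expression is a hypothetical stand-in written for this example (the framework's own "userpageR" is not reproduced here), and "detect_user" and ExampleBot are made-up names.

```python
import re

# Hypothetical pattern for a user page link in the page HTML.
USERPAGE_RE = re.compile(
    r'id="pt-userpage"[^>]*>\s*<a [^>]*>(?P<username>[^<]+)</a>')


def detect_user(page_html, family_name, configured_name):
    """Return (logged_in, username) the way the patched code would."""
    m = USERPAGE_RE.search(page_html)
    if m:
        logged_in, name = True, m.group('username')
    else:
        logged_in, name = False, None
    # Wikitravel adds the link by script, so trust the config instead.
    if family_name == 'wikitravel':
        logged_in, name = True, configured_name
    return logged_in, name


no_link = '<html></html>'
with_link = '<li id="pt-userpage"><a href="/wiki/User:Foo">Foo</a></li>'
```

Note that the override trusts the config unconditionally, so with this approach the framework cannot tell from the page whether the session is really logged in.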

mediawiki_message

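Some language versions do not seem to provide every system message that the framework looks up; judging from the code below, "readonly_lag" is missing on eo:, and both "readonly" and "readonly_lag" are missing on hi:. So we need to add the nine lines starting with "# for Wikitravel" below to set those system messages before "value" is retrieved by "key".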


        # for Wikitravel
        if self.family.name == 'wikitravel':
            # (eo:)
            if self.lang == 'eo':
                self._mediawiki_messages['readonly_lag'] = u'The database has been automatically locked while the slave database servers catch up to the master'
            # (hi:)
            if self.lang == 'hi':
                self._mediawiki_messages['readonly'] = u'Database locked'
                self._mediawiki_messages['readonly_lag'] = u'The database has been automatically locked while the slave database servers catch up to the master'

        key = key.lower()
        try:
            value = self._mediawiki_messages[key]
            return value
        except KeyError:
            raise KeyError("MediaWiki key '%s' does not exist on %s"
                           % (key, self))
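The same idea, setting the missing messages first and looking up the key afterwards, can be sketched as a standalone function. The table name and the function signature are invented for this example; the message texts are the ones used above.

```python
LAG_TEXT = (u'The database has been automatically locked while the '
            u'slave database servers catch up to the master')

# Messages to set per language version before the lookup.
FALLBACK_MESSAGES = {
    'eo': {'readonly_lag': LAG_TEXT},
    'hi': {'readonly': u'Database locked', 'readonly_lag': LAG_TEXT},
}


def mediawiki_message(messages, lang, key):
    """Look up key in messages after applying the per-language fallbacks."""
    for name, text in FALLBACK_MESSAGES.get(lang, {}).items():
        # Overwrites unconditionally, as the patch above does.
        messages[name] = text
    key = key.lower()
    try:
        return messages[key]
    except KeyError:
        raise KeyError("MediaWiki key '%s' does not exist on %s" % (key, lang))


hindi_readonly = mediawiki_message({}, 'hi', 'ReadOnly')
```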

wikitravel_family.py

langs and namespaces

You may need to add langs and namespaces when you get bot flags on other language versions. In particular, namespaces[2] and namespaces[4] are mandatory for the /Run subpages check. You can retrieve namespace information from the MediaWiki API. URLs below are examples.
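As a sketch of how the namespace names could be read from the API's siteinfo query: the host wiki.example is a placeholder and not one of the original example URLs, the function names are made up, and the sample response is hand-written in the older JSON layout in which the local name is stored under the "*" key.

```python
import json
from urllib.parse import urlencode


def namespaces_url(api_base):
    """Build a siteinfo query URL that asks for the namespace list."""
    params = [('action', 'query'), ('meta', 'siteinfo'),
              ('siprop', 'namespaces'), ('format', 'json')]
    return api_base + '?' + urlencode(params)


def namespace_names(response_text, wanted=(2, 4)):
    """Extract the local names of the wanted namespace numbers."""
    table = json.loads(response_text)['query']['namespaces']
    return dict((ns, table[str(ns)]['*']) for ns in wanted)


# Hand-written sample response, not real output from any wiki.
SAMPLE = ('{"query": {"namespaces": {'
          '"2": {"id": 2, "*": "User"}, '
          '"4": {"id": 4, "*": "Wikitravel"}}}}')

url = namespaces_url('https://wiki.example/api.php')  # placeholder host
names = namespace_names(SAMPLE)
```

The two names returned are the values that would go into namespaces[2] and namespaces[4] for that language version.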

Also, you can copy my "wikitravel_family.py".

Other modifications of wikitravel_family.py

Add the line below at the top of "wikitravel_family.py".


# -*- coding: utf-8  -*-

Add all language codes and "wts" to "langs".


        self.langs = {
            'ar':'ar',
            'ca':'ca',
            'de':'de',
            'en':'en',
            'eo':'eo',
            'es':'es',
            'fi':'fi',
            'fr':'fr',
            'he':'he',
            'hi':'hi',
            'hu':'hu',
            'it':'it',
            'ja':'ja',
            'nl':'nl',
            'pl':'pl',
            'pt':'pt',
            'ro':'ro',
            'ru':'ru',
            'sv':'sv',
            'zh':'zh',
            'wts':'wts',
        }

Add the lines below next to "languages_by_size" within "__init__" of "class Family". If you get bot flags on other language versions of Wikitravel listed above, add a line for each of them in the same way.


        # for Wikitravel's /Run subpages check.

        self.wt_script_policy = {
            '_default': u'Script policy',
            'en': u'Script policy',
            'ja': u'スクリプトの基本方針',
        }
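Note that the "botMayEdit" code shown earlier indexes this table directly by language code, so the "_default" entry is not consulted there. One way a lookup with a fallback could be written is sketched below; the function name is made up for this example.

```python
# -*- coding: utf-8 -*-

WT_SCRIPT_POLICY = {
    '_default': u'Script policy',
    'en': u'Script policy',
    'ja': u'スクリプトの基本方針',
}


def script_policy_title(lang, table=WT_SCRIPT_POLICY):
    """Return the policy page title for lang, or the default title."""
    return table.get(lang, table['_default'])


japanese_title = script_policy_title('ja')
fallback_title = script_policy_title('de')  # no entry, so the default is used
```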

Add the lines below after the "wt_script_policy" assignment that you added to "__init__" of "class Family" as described above.


        # Interwiki sorting order for Wikitravel Shared (wts:)
        self.alphabetic = [
            'ar', 'ca', 'de', 'en', 'eo', 'es', 'fi', 'fr', 'he', 'hi',
            'hu', 'it', 'ja', 'nl', 'pl', 'pt', 'ro', 'ru', 'sv', 'zh',
            'wts'
        ]

        # for Wikitravel Shared (wts:)
        # Which languages have a special order for putting interlanguage links,
        # and what order is it? If a language is not in interwiki_putfirst,
        # alphabetical order on language code is used. For languages that are in
        # interwiki_putfirst, interwiki_putfirst is checked first, and
        # languages are put in the order given there. All other languages are put
        # after those, in code-alphabetical order.

        self.interwiki_putfirst = {
            'ar': self.alphabetic,
            'ca': self.alphabetic,
            'de': self.alphabetic,
            'en': self.alphabetic,
            'eo': self.alphabetic,
            'es': self.alphabetic,
            'fi': self.alphabetic,
            'fr': self.alphabetic,
            'he': self.alphabetic,
            'hi': self.alphabetic,
            'hu': self.alphabetic,
            'it': self.alphabetic,
            'ja': self.alphabetic,
            'nl': self.alphabetic,
            'pl': self.alphabetic,
            'pt': self.alphabetic,
            'ro': self.alphabetic,
            'ru': self.alphabetic,
            'sv': self.alphabetic,
            'zh': self.alphabetic,
        }

        # for Wikitravel Shared (wts:), Previous DOTM, etc.

        # Allows crossnamespace interwiki linking.
        # Lists the possible crossnamespaces combinations
        # keys are originating NS
        # values are dicts where:
        #   keys are the originating langcode, or _default
        #   values are dicts where:
        #       keys are the languages that can be linked to from the lang+ns, or _default
        #       values are a list of namespace numbers

        self.crossnamespace[0] = {
            '_default': {
                '_default': [4],
                'wts': [4, 14],
            },
        }

        self.crossnamespace[4] = {
            '_default': {
                '_default': [0],
            },
        }

        self.crossnamespace[14] = {
            'wts': {
                '_default': [0], 
            },
        }
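How these tables are meant to be read, according to the comments above, can be sketched with two small functions. The function names are invented for this example, the tables are shortened copies of the ones above, and 'xx' is a made-up language code that is not in the ordering list.

```python
# Shortened copies of the tables above.
ALPHABETIC = ['ar', 'de', 'en', 'ja', 'wts']
CROSSNAMESPACE = {
    0: {'_default': {'_default': [4], 'wts': [4, 14]}},
    4: {'_default': {'_default': [0]}},
    14: {'wts': {'_default': [0]}},
}


def order_links(codes, putfirst):
    """Listed codes first, in list order; the rest in alphabetical order."""
    first = [code for code in putfirst if code in codes]
    rest = sorted(code for code in codes if code not in putfirst)
    return first + rest


def allowed_namespaces(table, origin_ns, origin_lang, target_lang):
    """Namespaces on target_lang that a page in origin_ns may link to."""
    by_origin = table.get(origin_ns, {})
    by_target = by_origin.get(origin_lang, by_origin.get('_default', {}))
    return by_target.get(target_lang, by_target.get('_default', []))


ordered = order_links(['wts', 'xx', 'ja', 'de'], ALPHABETIC)
article_to_shared = allowed_namespaces(CROSSNAMESPACE, 0, 'en', 'wts')
category_from_en = allowed_namespaces(CROSSNAMESPACE, 14, 'en', 'wts')
```

With these tables, an article on en: may link to namespaces 4 and 14 on wts:, while namespace 14 has an entry only for pages that originate on wts:, so the last lookup returns an empty list.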

Add an "if" statement to "scriptpath" to support Wikitravel Shared.


    def scriptpath(self, code):
        # for Wikitravel Shared (wts:)
        ##return '/wiki/%s' % code
        if code == 'wts':
            return '/wiki/shared'
        else:
            return '/wiki/%s' % code

Update the version number of Wikitravel's MediaWiki at the bottom of "wikitravel_family.py".


    def version(self, code):
#        return "1.10.1"
        return "1.11.2"