
User:Tatatabot/Scripts/pywikipedia/modification


The information on this page is outdated. See setup for Wikitravel instead.

To comply with Wikitravel's script policy, you need to modify several scripts of the Python Wikipedia Robot Framework (pywikipedia).

  • Modifications below are based on pywikipedia revision 6949.
  • A patch file for pywikipedia revision 7292 is here.


[edit] user-config.py

When you set up your bot for the first time, you need to create a file named "user-config.py" in the "pywikipedia" directory.


mylang = 'en'
family = 'wikitravel'
usernames['wikitravel']['en'] = u'USERNAME OF YOUR BOT'
put_throttle = 60

If you get bot flags on more than one language version of Wikitravel, you can add further "usernames" lines like the one above, one per language version.
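
For example, a bot flagged on both the English and Japanese versions might use lines like the following. This is only a sketch: in a real "user-config.py" the "usernames" dictionary is prepared by the framework before the file is read, so the "defaultdict" stand-in here is an assumption added to make the fragment run on its own, and the bot names are placeholders.

```python
from collections import defaultdict

# Stand-in for the dictionary that pywikipedia prepares before it reads
# user-config.py (an assumption, only here so the fragment runs by itself).
usernames = defaultdict(dict)

mylang = 'en'
family = 'wikitravel'
# One line per language version on which the bot has a flag.
usernames['wikitravel']['en'] = u'USERNAME OF YOUR BOT'
usernames['wikitravel']['ja'] = u'USERNAME OF YOUR BOT'
put_throttle = 60
```

In the real file, leave out the import and the stand-in line and keep only the settings.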

[edit] family.py

Comment out the "meatball" line in the code that sets "self.known_families" within "__init__" of "class Family".


            'mbtest':            'mbtest',
#            'meatball':          'meatball',
            'mediazilla':        'mediazilla',

Comment out the "world66" line in the code that sets "self.known_families" within "__init__" of "class Family".


            'wookieepedia':     'wookieepedia',
#            'world66':          'world66',
            'wowwiki':          'wowwiki',

Add the lines below next to "languages_by_size" within "__init__" of "class Family".


        # for Wikitravel's /Run subpages check.
        # Empty by default; wikitravel_family.py replaces it with a dict keyed by language code.
        self.wt_script_policy = {}

[edit] wikipedia.py

[edit] _getEditPage

To work around an error peculiar to Wikitravel's MediaWiki configuration, add the condition below.


        # for Wikitravel's /Run subpages check.
        ## if not matchVersionTab:
        if not matchVersionTab and self.site().family.name != 'wikitravel':

[edit] botMayEdit

Add the code below at the top of "botMayEdit".


        # for Wikitravel's /Run subpages check.
        wt_family_name = self.site().family.name
        if wt_family_name == 'wikitravel':
            wt_lang = self.site().lang
            wt_username = config.usernames[wt_family_name][wt_lang].encode('utf-8')
            wt_username = urllib.quote(wt_username)
            wt_namespace = self.site().family.namespaces[2][wt_lang].encode('utf-8')
            wt_namespace = urllib.quote(wt_namespace)
            wt_user_run = unicode(wt_namespace) + u':' + unicode(wt_username) + u'/Run'
            wt_page = Page(self.site(), wt_user_run)
            wt_user_run_text = wt_page.get(get_redirect = True)
            wt_namespace = self.site().family.namespaces[4][wt_lang].encode('utf-8')
            wt_namespace = urllib.quote(wt_namespace)
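            wt_policies = self.site().family.wt_script_policy
            # Fall back to '_default' for languages without their own entry.
            wt_script_policy = wt_policies.get(wt_lang, wt_policies['_default']).encode('utf-8')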
            wt_script_policy = urllib.quote(wt_script_policy)
            wt_system_run = unicode(wt_namespace) + u':' + unicode(wt_script_policy) + u'/Run'
            wt_page = Page(self.site(), wt_system_run)
            wt_system_run_text = wt_page.get(get_redirect = True)
            if 'yes' not in wt_user_run_text or 'yes' not in wt_system_run_text:
                raise Error('Bot stopped by /Run page on %s -- %s = %s , %s = %s'
                         % (wt_lang, wt_user_run, wt_user_run_text, wt_system_run, wt_system_run_text))
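
The decision made by this check can be sketched on its own, without pywikipedia. The helper names, the bot name "ExampleBot" and the namespace names "User" and "Wikitravel" below are assumptions for illustration only; the real code reads the namespace names from the family file and the page texts from the wiki.

```python
def run_page_title(namespace, page_name):
    """Build a /Run subpage title such as 'User:ExampleBot/Run'."""
    return u'%s:%s/Run' % (namespace, page_name)


def bot_may_run(user_run_text, system_run_text):
    """Return True only if both /Run pages contain 'yes'."""
    return 'yes' in user_run_text and 'yes' in system_run_text


# Hypothetical titles for an English-language bot.
user_run = run_page_title(u'User', u'ExampleBot')
system_run = run_page_title(u'Wikitravel', u'Script policy')
print(user_run)
print(system_run)
print(bot_may_run(u'yes', u'no'))
```

The bot continues only when both pages say "yes"; if either page lacks it, the patched "botMayEdit" raises an error and the bot stops.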

[edit] replaceLanguageLinks

To follow Wikitravel's manual of style, we need to add the four lines starting "# for Wikitravel's language links position.".


            # Is there any text in the 'after' part that means we should keep it after?
            if "</noinclude>" in s2[firstafter:]:
                if separatorstripped:
                    s = separator + s
                newtext = s2[:firstafter].replace(marker,'') + s + s2[firstafter:]
            elif site.language() in site.family.categories_last:
                cats = getCategoryLinks(s2, site = site)
                s2 = removeCategoryLinksAndSeparator(s2.replace(marker,'',cseparatorstripped).strip(), site) + separator + s
                newtext = replaceCategoryLinks(s2, cats, site=site, addOnly=True)
            # for Wikitravel's language links position.
            elif site.family.name == 'wikitravel':
                s = separator + s + separator
                newtext = s2[:firstafter].replace(marker,'') + s + s2[firstafter:]
            else:
                newtext = s2.replace(marker,'').strip() + separator + s
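
The effect of the added branch can be sketched as a standalone function. This is an illustration under assumptions, not the framework's code: it assumes "firstafter" is the index just past a marker left where the old language links were removed, and the function name, the "@@" marker and the newline separator are all hypothetical.

```python
def place_language_links(text, links, marker, separator=u'\n'):
    """Put the language links where the marker is, not at the end."""
    position = text.find(marker)
    if position < 0:
        # No marker found: fall back to the end of the text.
        firstafter = len(text)
    else:
        firstafter = position + len(marker)
    block = separator + links + separator
    # Drop the marker from the leading part, then re-insert the links
    # between the leading part and whatever followed the marker.
    return text[:firstafter].replace(marker, u'') + block + text[firstafter:]


sample = u'Intro text@@\n[[Category:Example]]'
print(place_language_links(sample, u'[[de:Beispiel]]', u'@@'))
```

With this branch, text that followed the original language links stays after them instead of being moved in front.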

[edit] _getUserData

Wikitravel's MediaWiki does not output the user page link at the top of pages; instead, "username.js" adds it after a page has loaded. Because of this, the username cannot be retrieved from the element with id="pt-userpage", so we need to add the four lines starting "# for Wikitravel's user page link." below to set the username from the config. If you do not, you will be asked for your password on every edit, for each page on each language version.


        m = userpageR.search(text)
        if m:
            self._isLoggedIn[index] = True
            self._userName[index] = m.group('username')
        else:
            self._isLoggedIn[index] = False
            # No idea what is the user name, and it isn't important
            self._userName[index] = None

        # for Wikitravel's user page link.
        if self.family.name == 'wikitravel':
            self._isLoggedIn[index] = True
            self._userName[index] = config.usernames[self.family.name][self.lang].encode('utf-8')

        # Check user groups, if possible (introduced in 1.10)
        groupsR = re.compile(r'var wgUserGroups = \[\"(.+)\"\];')
        m = groupsR.search(text)

[edit] mediawiki_message

The code below sets the system messages "readonly" and "readonly_lag" for the eo: and hi: versions. Add the nine lines starting "# for Wikitravel" before the code that retrieves "value" by "key".


        # for Wikitravel
        if self.family.name == 'wikitravel':
            # (eo:)
            if self.lang == 'eo':
                self._mediawiki_messages['readonly_lag'] = u'The database has been automatically locked while the slave database servers catch up to the master'
            # (hi:)
            if self.lang == 'hi':
                self._mediawiki_messages['readonly'] = u'Database locked'
                self._mediawiki_messages['readonly_lag'] = u'The database has been automatically locked while the slave database servers catch up to the master'

        key = key.lower()
        try:
            value = self._mediawiki_messages[key]
            return value
        except KeyError:
            raise KeyError("MediaWiki key '%s' does not exist on %s"
                           % (key, self))

[edit] wikitravel_family.py

[edit] langs and namespaces

You may need to add langs and namespaces when you get bot flags on other language versions, especially namespaces[2] and namespaces[4], which are mandatory for the /Run subpages check. You can retrieve namespace information from the MediaWiki API. The URLs below are examples.

Also, you can copy my "wikitravel_family.py".
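
As a rough sketch of how the namespace names could be read from the MediaWiki API, the code below builds a "siteinfo" query URL and extracts namespaces 2 and 4 from a JSON response. The base URL "http://example.org/api.php" is a placeholder, the helper names are hypothetical, and the sample response is hand-written in the shape of a siteinfo reply rather than real output. The sketch is written for Python 3 and makes no network request.

```python
import json
from urllib.parse import urlencode


def siteinfo_namespaces_url(api_url):
    """Build a query URL asking the API for the site's namespaces."""
    params = {
        'action': 'query',
        'meta': 'siteinfo',
        'siprop': 'namespaces',
        'format': 'json',
    }
    return api_url + '?' + urlencode(params)


def namespace_names(response_text, wanted=(2, 4)):
    """Return {namespace number: local name} for the wanted namespaces."""
    namespaces = json.loads(response_text)['query']['namespaces']
    return {ns: namespaces[str(ns)]['*'] for ns in wanted}


# Hand-written sample in the shape of a siteinfo response (not real output).
sample = ('{"query": {"namespaces": {'
          '"2": {"id": 2, "*": "User"}, '
          '"4": {"id": 4, "*": "Wikitravel"}}}}')

print(siteinfo_namespaces_url('http://example.org/api.php'))
print(namespace_names(sample))
```

The names returned for 2 and 4 are the values that would go into namespaces[2] and namespaces[4] for that language.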

[edit] Other modification of wikitravel_family.py

Add the line below at the top of "wikitravel_family.py".


# -*- coding: utf-8  -*-

Add all language codes and "wts" to "langs".


        self.langs = {
            'ar':'ar',
            'ca':'ca',
            'de':'de',
            'en':'en',
            'eo':'eo',
            'es':'es',
            'fi':'fi',
            'fr':'fr',
            'he':'he',
            'hi':'hi',
            'hu':'hu',
            'it':'it',
            'ja':'ja',
            'nl':'nl',
            'pl':'pl',
            'pt':'pt',
            'ro':'ro',
            'ru':'ru',
            'sv':'sv',
            'zh':'zh',
            'wts':'wts',
        }

Add the lines below next to "languages_by_size" within "__init__" of "class Family". If you get bot flags on other language versions of Wikitravel listed above, add a line for each of them in the same way.


        # for Wikitravel's /Run subpages check.

        self.wt_script_policy = {
            '_default': u'Script policy',
            'en': u'Script policy',
            'ja': u'スクリプトの基本方針',
        }

Add the lines below next to "wt_script_policy" (which you added above) within "__init__" of "class Family".


        # Interwiki sorting order for Wikitravel Shared (wts:)
        self.alphabetic = [
            'ar', 'ca', 'de', 'en', 'eo', 'es', 'fi', 'fr', 'he', 'hi',
            'hu', 'it', 'ja', 'nl', 'pl', 'pt', 'ro', 'ru', 'sv', 'zh',
            'wts'
        ]

        # for Wikitravel Shared (wts:)
        # Which languages have a special order for putting interlanguage links,
        # and what order is it? If a language is not in interwiki_putfirst,
        # alphabetical order on language code is used. For languages that are in
        # interwiki_putfirst, interwiki_putfirst is checked first, and
        # languages are put in the order given there. All other languages are put
        # after those, in code-alphabetical order.

        self.interwiki_putfirst = {
            'ar': self.alphabetic,
            'ca': self.alphabetic,
            'de': self.alphabetic,
            'en': self.alphabetic,
            'eo': self.alphabetic,
            'es': self.alphabetic,
            'fi': self.alphabetic,
            'fr': self.alphabetic,
            'he': self.alphabetic,
            'hi': self.alphabetic,
            'hu': self.alphabetic,
            'it': self.alphabetic,
            'ja': self.alphabetic,
            'nl': self.alphabetic,
            'pl': self.alphabetic,
            'pt': self.alphabetic,
            'ro': self.alphabetic,
            'ru': self.alphabetic,
            'sv': self.alphabetic,
            'zh': self.alphabetic,
        }

        # for Wikitravel Shared (wts:), Previous DOTM, etc.

        # Allows crossnamespace interwiki linking.
        # Lists the possible crossnamespaces combinations
        # keys are originating NS
        # values are dicts where:
        #   keys are the originating langcode, or _default
        #   values are dicts where:
        #       keys are the languages that can be linked to from the lang+ns, or _default
        #       values are a list of namespace numbers

        self.crossnamespace[0] = {
            '_default': {
                '_default': [4],
                'wts': [4, 14],
            },
        }

        self.crossnamespace[4] = {
            '_default': {
                '_default': [0],
            },
        }

        self.crossnamespace[14] = {
            'wts': {
                '_default': [0], 
            },
        }
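
The comments in the code above describe two lookups, which can be sketched as plain functions. Both helpers are hypothetical and follow only my reading of those comments; they are not the framework's own implementation. The data is copied from the lists above.

```python
# Copied from the settings above.
ALPHABETIC = [
    'ar', 'ca', 'de', 'en', 'eo', 'es', 'fi', 'fr', 'he', 'hi',
    'hu', 'it', 'ja', 'nl', 'pl', 'pt', 'ro', 'ru', 'sv', 'zh',
    'wts',
]
CROSSNAMESPACE = {
    0: {'_default': {'_default': [4], 'wts': [4, 14]}},
    4: {'_default': {'_default': [0]}},
    14: {'wts': {'_default': [0]}},
}


def order_interwiki(codes, putfirst=None):
    """Codes listed in putfirst come first, in that order; all other
    codes follow in code-alphabetical order."""
    putfirst = putfirst or []
    first = [code for code in putfirst if code in codes]
    rest = sorted(code for code in codes if code not in putfirst)
    return first + rest


def target_namespaces(crossnamespace, ns, lang, target_lang):
    """Namespaces in target_lang that a page in (lang, ns) may link to,
    falling back to '_default' at each level."""
    by_lang = crossnamespace.get(ns, {})
    targets = by_lang.get(lang, by_lang.get('_default', {}))
    return targets.get(target_lang, targets.get('_default', []))


print(order_interwiki(['wts', 'zh', 'en'], ALPHABETIC))
print(order_interwiki(['wts', 'zh', 'en']))
print(target_namespaces(CROSSNAMESPACE, 0, 'en', 'wts'))
```

The first two calls show why "interwiki_putfirst" is set: with the list, "wts" is placed after "zh", while plain alphabetical order would place it before "zh".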

Add an "if" statement to "scriptpath" to support Wikitravel Shared.


    def scriptpath(self, code):
        # for Wikitravel Shared (wts:)
        ##return '/wiki/%s' % code
        if code == 'wts':
            return '/wiki/shared'
        else:
            return '/wiki/%s' % code

Update the version number of Wikitravel's MediaWiki at the bottom of "wikitravel_family.py".


    def version(self, code):
#        return "1.10.1"
        return "1.11.2"