User:GreenC/software/iabget

From Wikipedia, the free encyclopedia

iabget is a command-line interface (CLI) to the InternetArchiveBot (IABot) API.

Requirements

  • GNU Awk 4.1+ (gawk, the GNU implementation of the standard POSIX awk tool)
  • PHP 7+ (earlier versions may work)
  • jq 1.5+ (earlier versions may work)
  • GNU timeout and wget
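The requirements above can be checked before setup with a small preflight helper. This is only a sketch: `missing_tools` is a hypothetical name, and the command names in the usage note (`gawk`, `php`, `jq`, `timeout`, `wget`) are the usual ones but may differ on a given system. It does not check version numbers.

```shell
# missing_tools NAME [NAME...]
# Prints each command name that cannot be found on PATH, one per line.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools gawk php jq timeout wget` should print nothing on a system that has all five installed.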

Setup

  • Copy iabget.awk to a directory.
  • chmod 750 iabget.awk
  • Add a symlink, e.g. ln -s iabget.awk iabget
  • Edit iabget and configure it:
    • Set hash-bang line to location of GNU awk
    • Edit the "SETUP" section
      • Set paths to programs
      • The "bell" is optional and can be ignored/blanked
      • Follow the directions for obtaining OAuth keys
      • Create a custom agent string
      • Set the location of the cookie file
  • Bot permissions are required in the IABot tool. Request them from an IABot admin.
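The copy, chmod and symlink steps above could be scripted roughly as follows. This is a sketch under assumptions: `install_iabget` is a hypothetical helper and the destination directory is whatever you choose. Editing the "SETUP" section and obtaining OAuth keys remain manual steps.

```shell
# install_iabget SRC DESTDIR
# Copies the script into DESTDIR as iabget.awk, sets mode 750,
# and creates the iabget symlink next to it.
install_iabget() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    cp "$src" "$dest/iabget.awk" || return 1
    chmod 750 "$dest/iabget.awk" || return 1
    # Relative link, equivalent to running "ln -s iabget.awk iabget" in DESTDIR.
    ( cd "$dest" && ln -sf iabget.awk iabget )
}
```

For example, `install_iabget ./iabget.awk "$HOME/bin"` would place both the script and the symlink in `$HOME/bin`.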

Usage

 iabget - InternetArchiveBot API command-line interface

           -a     <action>         Action (see API doc): 
                                     searchpagefromurl
                                     searchurlfrompage
                                     searchurldata
                                     modifyurl
                                     submitbotjob (use with -f not -p)
                                     getbotjob
                                     getbotqueue
                                     logout
           -p     <parameter>      Parameter=value string (see API doc).
                                     Separating & between parameters should be {&} to disambiguate from & in URLs
                                     eg. -p "urlid=55{&}archiveurl=http://.."
           -f     <filename>       Filename containing the postdata encoded; or article encoded
           -l     <language>       Wiki language code (default: en). Valid is "wikidata"
           -d     <level>          Turn on debugging. 1 = level1 2 = level2
           -e                      Send error msgs to stdout (default: stderr)
           -w                      Show raw JSON
           -h                      help

  Examples:
     iabget -a getbotqueue -p displayqueued -w
     iabget -a getbotjob -p id=200 -w
     iabget -a searchurlfrompage -p pageids=4589
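When a calling script assembles the -p string from several parameters, the {&} separator described in the usage text can be added by a small helper. This is a minimal sketch; `join_params` is a hypothetical name and not part of iabget.

```shell
# join_params KEY=VALUE [KEY=VALUE...]
# Joins its arguments with the literal separator {&}, leaving any &
# inside a value (such as a URL) untouched.
join_params() {
    joined=$1
    if [ "$#" -gt 0 ]; then
        shift
    fi
    for pair in "$@"; do
        joined="${joined}{&}${pair}"
    done
    printf '%s\n' "$joined"
}
```

A call might then look like `iabget -a modifyurl -p "$(join_params 'urlid=55' 'archiveurl=http://example.com/inx&path')" -w`, where the parameter values are placeholders.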

Notes

  • -a is the name of the API action to run, and -p supplies its parameters. Both are described in the API docs. The action and parameter values are passed through to the API as-is.
  • Some parameter values, such as URLs and titles, can themselves contain an &. The -p string therefore uses {&} to separate parameters. For example, instead of -p "urlid=55&archiveurl=http://example.com/inx&path", write -p "urlid=55{&}archiveurl=http://example.com/inx&path"
  • The -w option is recommended because it returns the raw JSON response from IABot. An example success response:
{
    "result": "success",
    "revid": 1266710138,
    "errors": null,
    "loggedon": true,
    "username": "GreenC bot",
    "checksum": "3fb5106f3f55e8bb225e5cfd5e270c1f",
    "csrf": "e97c4fdf34bc05f2faa489089e961c12",
    "servetime": 0.5149
}

If "result" is not "success", something went wrong, for example because of a timeout or server load. The calling application can sleep and retry a few times before giving up.
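The sleep-and-retry advice can be sketched as a shell wrapper. The function names `is_success` and `retry_iab` are hypothetical, as are the retry count and delay passed in. The sketch assumes the wrapped command prints the JSON response on stdout, as iabget does with -w. It uses jq when available and falls back to a simple pattern match otherwise.

```shell
# is_success JSON: succeed when the "result" field equals "success".
is_success() {
    if command -v jq >/dev/null 2>&1; then
        [ "$(printf '%s' "$1" | jq -r '.result' 2>/dev/null)" = "success" ]
    else
        # Fallback pattern match for systems without jq (less robust).
        printf '%s' "$1" | grep -Eq '"result"[[:space:]]*:[[:space:]]*"success"'
    fi
}

# retry_iab MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Prints the first successful JSON response; returns 1 if none succeeds.
retry_iab() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # Capture stdout only; a failing command counts as a failed attempt.
        response=$("$@") || true
        if is_success "$response"; then
            printf '%s\n' "$response"
            return 0
        fi
        if [ "$attempt" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example, `retry_iab 3 30 iabget -a getbotjob -p id=200 -w` would try up to three times, waiting 30 seconds between attempts.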