python3-argparser的使用

引言

在LIBSVM和LIBLINEAR等软件包中，在基本的算法代码之上是软件包的接口，用户可以通过命令行来指定训练的超参数设置，比如在LIBSVM的训练代码中:

void exit_with_help() {
    printf(
        "Usage: svm-train [options] training_set_file [model_file]\n"
        "options:\n"
        "-s svm_type : set type of SVM (default 0)\n"
        "	0 -- C-SVC		(multi-class classification)\n"
        "	1 -- nu-SVC		(multi-class classification)\n"
        "	2 -- one-class SVM\n"
        "	3 -- epsilon-SVR	(regression)\n"
        "	4 -- nu-SVR		(regression)\n"
        "-t kernel_type : set type of kernel function (default 2)\n"
        "	0 -- linear: u'*v\n"
        "	1 -- polynomial: (gamma*u'*v + coef0)^degree\n"
        "	2 -- radial basis function: exp(-gamma*|u-v|^2)\n"
        "	3 -- sigmoid: tanh(gamma*u'*v + coef0)\n"
        "	4 -- precomputed kernel (kernel values in training_set_file)\n"
        "-d degree : set degree in kernel function (default 3)\n"
        "-g gamma : set gamma in kernel function (default 1/num_features)\n"
        "-r coef0 : set coef0 in kernel function (default 0)\n"
        "-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR "
        "(default 1)\n"
        "-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR "
        "(default 0.5)\n"
        "-p epsilon : set the epsilon in loss function of epsilon-SVR (default "
        "0.1)\n"
        "-m cachesize : set cache memory size in MB (default 100)\n"
        "-e epsilon : set tolerance of termination criterion (default 0.001)\n"
        "-h shrinking : whether to use the shrinking heuristics, 0 or 1 "
        "(default 1)\n"
        "-b probability_estimates : whether to train a SVC or SVR model for "
        "probability estimates, 0 or 1 (default 0)\n"
        "-wi weight : set the parameter C of class i to weight*C, for C-SVC "
        "(default 1)\n"
        "-v n: n-fold cross validation mode\n"
        "-q : quiet mode (no outputs)\n");
    exit(1);
}

这样的命令行接口对训练的参数设置做了详细的约定。恰好笔者这段时间在设计一个简单的Python神经网络框架，便打算实现这样一个命令行接口，通过指定网络超参数、优化算法、学习率和数据集等信息，在命令行中完成对神经网络的训练和预测等功能，而不需要从零开始编写代码去实现。

命令行参数

运行Python脚本的命令:

$ python test.py

如果我们想从命令行中对程序进行指定，比如test.py中是一个函数，我们想在命令行中输入，然后让python脚本输出结果，也就是

$ python test.py 4 # 4就是函数的参数

这样我们就不需要在Python脚本内编写input语句。现在的问题是如何在Python脚本中获取到命令行参数，我们可以借助sys.argv实现:

# test.py
from sys import argv

if __name__ == "__main__":
    print(argv)

运行

$ python test.py
['.\\test.py']

如果运行

$ python test.py 1
['.\\test.py', '1']

发现文件名本身就是一个参数了，站在python.exe的角度想，不同的文件名作参数会有不同的结果，因此文件名确实是一个变量。通过下面的程序

# test.py
from sys import argv

def f(x):
    return x**2

if __name__ == "__main__":
    python_arg = argv[1:]
    if len(python_arg) == 0:
        print("no parameter input")
    else:
        print([f(eval(x)) for x in python_arg])

输入命令行

$ python test.py 1 2 3
[1, 4, 9]

这样我们就可以得到输入参数1、2、3的平方，值得注意的是输入参数统一是字符串类型。

但我们离LIBSVM那样的命令行接口还有很大的差距，比如在LIBSVM中

$ ./train -s 0

程序会知道-s是一个可选参数，其值为0。但对于Python中的argv来说，它会将-s与0一视同仁，都作为输入参数。这是否意味着我们还需要编写大量规则来进行这样的参数识别？

Python的命令行接口——argparse

argparse 模块可以让人轻松编写用户友好的命令行接口。程序定义它需要的参数，然后 argparse 将弄清如何从 sys.argv 解析出那些参数。 argparse 模块还会自动生成帮助和使用手册，并在用户给程序传入无效参数时报出错误信息。

我们从最基本的例子开始

# test.py
import argparse
parser = argparse.ArgumentParser()
parser.parse_args()

这里定义了一个参数解析器，然后开启处理参数的功能。程序运行情况如下

$ python test.py
$ python test.py -h
usage: test.py [-h]

optional arguments:
  -h, --help  show this help message and exit
$ python test.py x
usage: test.py [-h]
test.py: error: unrecognized arguments: x

我们对上面的现象进行解释:

除了文件名之外无参数，不需要处理，因此无输出；
“-h”/”–help”是argparse自带的参数，可以得到命令行的帮助信息；
现在的parser只支持-h和–help，如果是其他的参数，程序会报错，同时输出帮助信息.

位置参数

考虑下面的例子

# test.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("param")
args = parser.parse_args()
print(args, args.param)

运行程序

$ python test.py
usage: test.py [-h] param
test.py: error: the following arguments are required: param
$ python test.py -h
usage: test.py [-h] param

positional arguments:
  param

optional arguments:
  -h, --help  show this help message and exit
$ python test.py foo
Namespace(param='foo') foo
$ python test.py foo bar
usage: test.py [-h] param
test.py: error: unrecognized arguments: bar

如何理解上面几个运行结果的逻辑？通过输出帮助信息，我们可以知道param是位置参数，而-h/--help是可选参数，param参数必须存在，除非参数中含有-h/--help。如果我们在param这个位置参数后面多输入一个参数，parser则会保存，因为它不认识这个参数。

我们可以为参数添加帮助信息:

# parser.add_argument("param")
parser.add_argument("param", help="a positional argument")

这样，当我们获取帮助信息时，程序会输出:

usage: test.py [-h] param

positional arguments:
  param       a positional argument

optional arguments:
  -h, --help  show this help message and exit

我们可以通过下面的操作，获取到指定类型的参数:

parser.add_argument(
    "param", 
    help="a positional argument",
    type=int,
)

比如这样就可以让param位置对应的参数为整型。

可选参数

研究玩位置参数，我们来研究可选参数(optional arguments)，显然-h/--help就是可选参数。我们可以像下面这样添加可选参数:

# test.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--param", help="a optional argument")
args = parser.parse_args()
print(args, args.param)

测试:

$ python test.py
Namespace(param=None) None
$ python test.py -h
usage: test.py [-h] [--param PARAM]

optional arguments:
  -h, --help     show this help message and exit
  --param PARAM  a optional argument
$ python test.py --param x
Namespace(param='x') x
$ python test.py x
usage: test.py [-h] [--param PARAM]
test.py: error: unrecognized arguments: x
$ python test.py --param
usage: test.py [-h] [--param PARAM]
test.py: error: argument --param: expected one argument

如果只存在可选参数，那么不添加参数显然可行；
如果想要传递可选参数，则必须注明参数名，后面接着写参数值，如果直接写参数值，parser会将它视作固定参数。在我们这个例子中，由于没有固定参数，所以不写参数名会直接报错退出；
如果只写参数名，不写参数值也会报错。

我们发现parser自带的-h/--help参数是不需要参数值的，类似的情况还有很多:对于程序而言，可选参数的值至于两个有实际意义:True和False。我们可以像下面这样修改以适应需要:

# test.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument(
    "--param", 
    help="a optional argument",
    action="store_true",
)
args = parser.parse_args()
print(args, args.param)

进行测试:

$ python test.py
Namespace(param=False) False
$ python test.py -h
usage: test.py [-h] [--param]

optional arguments:
  -h, --help  show this help message and exit
  --param     a optional argument
$ python test.py --param
Namespace(param=True) True
$ python test.py --param x
usage: test.py [-h] [--param]
test.py: error: unrecognized arguments: x

通过修改，现在程序的参数规则为：只要有--param参数，对应的参数值就为True，否则为False；
在帮助信息中，原来参数的形式为--param PARAM，意思在--param之后的那个值就是参数的值；而现在只有--param，也就是不需要在后面加参数值了；
尝试在--param的后面加上参数值，发现parser会将其识别为位置参数.

短选项

可以看到--help参数项有一个等价的简便形式:-h，这被称作短选项。我们也可以为自己设计的参数增加短选项:

parser.add_argument(
    "-p",
    "--param", 
    help="a optional argument",
    action="store_true",
)

测试:

$ python test.py -h
usage: test.py [-h] [-p]

optional arguments:
  -h, --help   show this help message and exit
  -p, --param  a optional argument
$ python test.py -p
Namespace(param=True) True

可以发现短选项与原参数等价。

限制参数范围

上面提到的LIBSVM命令行接口函数中，对于-s参数，它只接受0~4这5种取值，否则应该报错。类似的，argparse也有相应机制:

# test.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument(
    "-p",
    "--param",
    type=int,
    help="a optional argument",
    choices=[0, 1, 2],
)
args = parser.parse_args()
print(args, args.param)

测试:

$ python test.py -h
usage: test.py [-h] [-p {0,1,2}]

optional arguments:
  -h, --help            show this help message and exit
  -p {0,1,2}, --param {0,1,2}
                        a optional argument
$ python test.py -p 1
Namespace(param=1) 1
$ python test.py -p 3
usage: test.py [-h] [-p {0,1,2}]
test.py: error: argument -p/--param: invalid choice: 3 (choose from 0, 1, 2)

注意到限制参数范围后，parser对于不合法的输入会报错。

简单应用

在https://github.com/Kaslanarian/PyNet中，我们设计了一个神经网络框架，我们想设计一个基于命令行的单隐层训练模式。命令行处理的程序如下(train.py):

import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        # 处理格式问题
        formatter_class=argparse.RawTextHelpFormatter,
    )

    # 可选参数:隐层的神经元数目，默认是输入层神经元数目+输出层神经元数目
    parser.add_argument(
        "-n",
        help="number of neurons in hidden layer, (default n_input+n_output)",
        type=int,
    )
    # 可选参数:隐层的激活函数，默认是ReLU函数，限制只能选择0~5
    parser.add_argument(
        "-ha",
        help='''activation function of hidden layer(default 0)
        0 --- ReLU
        1 --- elu
        2 --- leaky relu with alpha=0.1
        3 --- sigmoid
        4 --- tanh
        5 --- softmax
        ''',
        type=int,
        choices=[0, 1, 2, 3, 4, 5],
    )
    # 可选参数:输出层的激活函数，默认是softmax函数，限制只能选择0~5
    parser.add_argument(
        "-oa",
        help='''activation function of output layer(default 5)
        0 --- ReLU
        1 --- elu
        2 --- leaky relu with alpha=0.1
        3 --- sigmoid
        4 --- tanh
        5 --- softmax
        ''',
        type=int,
        choices=[0, 1, 2, 3, 4, 5],
    )
    # 可选参数:损失函数，默认是交叉熵函数，限制只能选0和1
    parser.add_argument(
        "-l",
        help='''loss function of network(default 0)
        0 --- cross entropy loss
        1 --- mean square loss
        ''',
        type=int,
        choices=[0, 1],
    )
    # 可选参数:是否进行He初始化，只关心True还是FALSE
    parser.add_argument(
        "-he",
        help="whether use He initialization",
        action="store_true",
    )
    # 可选参数:是否进行Xavier初始化，只关心True还是FALSE
    parser.add_argument(
        "-xavier",
        help="whether use Xavier initialization",
        action="store_true",
    )
    # 可选参数:选择正则化方法，限制在0~2
    parser.add_argument(
        "-reg",
        help='''whether use regularization(default 0)
        0 --- No regularize
        1 --- L1 regularize
        2 --- L2 regularize
        ''',
        type=int,
        choices=[0, 1, 2],
    )
    # 可选参数:正则化系数
    parser.add_argument(
        "-coef",
        help="regularized term coefficient",
        type=float,
    )
    # 可选参数，是否进行固定初始化，只关心True or False
    parser.add_argument(
        "-fix",
        help="Use fixed data to initialize net",
        action="store_true",
    )
    
    # 可选参数:训练轮数
    parser.add_argument("-e", help="training epoch(default 10)", type=int)
    # 可选参数:训练batch_size
    parser.add_argument("-b", help="batch size(default 32)", type=int)
    # 可选参数:训练的优化方法
    parser.add_argument(
        "-opt",
        help='''choose net optimizer(default 0)
        0 --- BGD
        1 --- Momentum
        2 --- Adagrad
        3 --- Adadetla
        4 --- Adam
        ''',
        type=int,
        choices=[0, 1, 2, 3, 4],
    )

    # 学习率
    parser.add_argument("-lr", help="learning rate(default=0.1)", type=float)
    # 训练数据集，位置参数，必须要有
    parser.add_argument("data", help="path of training data")
    # 是否对训练数据标准化
    parser.add_argument(
        "-std",
        help="whether standard scale training data",
        action="store_true",
    )
    # 可选参数:训练完成后保存的模型文件名，默认是{数据集文件名}.model
    parser.add_argument("-name",
                        help="filename of model, default {data}.model")

    args = parser.parse_args()

输出的帮助信息是这样的:

$ python train.py -h
usage: train.py [-h] [-n N] [-ha {0,1,2,3,4,5}] [-oa {0,1,2,3,4,5}] [-l {0,1}] [-he] [-xavier] [-reg {0,1,2}] [-coef COEF] [-fix] [-e E] [-b B]
                [-opt {0,1,2,3,4}] [-lr LR] [-std] [-name NAME]
                data

positional arguments:
  data               path of training data

optional arguments:
  -h, --help         show this help message and exit
  -n N               number of neurons in hidden layer, (default n_input+n_output)
  -ha {0,1,2,3,4,5}  activation function of hidden layer(default 0)
                             0 --- ReLU
                             1 --- elu
                             2 --- leaky relu with alpha=0.1
                             3 --- sigmoid
                             4 --- tanh
                             5 --- softmax

  -oa {0,1,2,3,4,5}  activation function of output layer(default 5)
                             0 --- ReLU
                             1 --- elu
                             2 --- leaky relu with alpha=0.1
                             3 --- sigmoid
                             4 --- tanh
                             5 --- softmax

  -l {0,1}           loss function of network(default 0)
                             0 --- cross entropy loss
                             1 --- mean square loss

  -he                whether use He initialization
  -xavier            whether use Xavier initialization
  -reg {0,1,2}       whether use regularization(default 0)
                             0 --- No regularize
                             1 --- L1 regularize
                             2 --- L2 regularize

  -coef COEF         regularized term coefficient
  -fix               Use fixed data to initialize net
  -e E               training epoch(default 10)
  -b B               batch size(default 32)
  -opt {0,1,2,3,4}   choose net optimizer(default 0)
                             0 --- BGD
                             1 --- Momentum
                             2 --- Adagrad
                             3 --- Adadetla
                             4 --- Adam

  -lr LR             learning rate(default=0.1)
  -std               whether standard scale training data
  -name NAME         filename of model, default {data}.model

尽管我们并没有对argparse进行深入学习，我们已经能够编写比较方便的命令行接口。

编写命令行接口

引言