Scripting CCG Executables
Pygeostat can be used to script parameter files and run CCG/GSLIB software (executable files), which makes it possible to build advanced geostatistical workflows. This is done using the Program class.
Program Class
class pygeostat.programs.programs.Program(program=None, parstr=None, parfile='temp', getpar=None, nogetarg=False, defaultdict={})
Base class containing routines for running GSLIB programs
Get Parameter File
Program.getparfile(quiet=True)
Get the parfile from this program by copying to the clipboard or printing the parfile. This replaces the need to pre-execute a program to get the parfile, but relies on CCG programs being properly configured to generate the correct parfile upon first execution.
This function requires pyperclip. This is a dependency of pygeostat, but can be installed with:
> pip install pyperclip
Parameters: quiet (bool) – if True (the default), the parameter file is copied to the clipboard, wrapped in block quotes
Run Program
Program.run(parstr=None, parfile=None, program=None, nogetarg=None, filehandle=None, logfile=None, testfilename=None, pardict=None, quiet=False, liveoutput=None, chdirpath=None)
Runs a GSLIB style program using the subprocess module and prints the output. On an error, the output is printed and an exception is raised
The only required parameters are program and parstr; all other parameters are optional.
Parameters: - parstr (str) – parameters, taken from self if None
- parfile (str or callable) – name of parameter file to create, or a callable function that returns a unique path/parfile.par string to use for this parfile.
- program (str) – name of GSLIB/CCG program to run, taken from self if None
- nogetarg – uses a pipe + communicate for the call instead of arguments
- filehandle – handle for file to write program output to
- logfile (str) – filename for a log file to write program output to. If file already exists it will overwrite the file. If filehandle is passed then logfile will be ignored.
- testfilename (list) – name(s) to check for availability prior to executing the program
- quiet – if True, do not print the message that tells the user the program is being called
- liveoutput (bool) – live update the ipython notebook or calling script with the output of the called program
- chdirpath (str) – Some programs use files that prevent multiple instances of the same program from running in the same directory. This can be used to change the path the program is called from to prevent such conflicts (for example, with kt3d_lva)
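The parfile parameter accepts a callable that returns a unique parfile path. The following is a minimal sketch of such a callable using only the standard library. The helper name make_parfile_namer is hypothetical, and it is an assumption that Program.run calls the callable with no arguments.

```python
import itertools
import os
import tempfile


def make_parfile_namer(prefix, directory=None):
    """Return a zero-argument callable that yields a new parfile path per call.

    Hypothetical helper for illustration; not part of pygeostat.
    """
    directory = directory or tempfile.gettempdir()
    counter = itertools.count(1)

    def next_parfile():
        # Combine the process id and a running counter so that names do not
        # collide between calls or between scripts running at the same time.
        name = "%s_%d_%d.par" % (prefix, os.getpid(), next(counter))
        return os.path.join(directory, name)

    return next_parfile


unique_parfile = make_parfile_namer("histplt")
first, second = unique_parfile(), unique_parfile()
print(first)
print(second)
```

Under that assumption, such a callable could be passed as parfile=unique_parfile so that concurrent calls do not overwrite each other's parameter files.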
Examples
An example of setting up a few keywords and running histplt:
>>> histpltpar = ''' Parameters for HISTPLT
... **********************
...
... START OF PARAMETERS:
... {datafl}        -file with data
... {varcol} 0      - columns for variable and weight
... -1.0 1.0e21     - trimming limits
... {outfl}         -file for PostScript output
... 0.0 -20.0       -attribute minimum and maximum
... -1.0            -frequency maximum (<0 for automatic)
... 20              -number of classes
... 0               -0=arithmetic, 1=log scaling
... 0               -0=frequency, 1=cumulative histogram
... 0               - number of cum. quantiles (<0 for all)
... 3               -number of decimal places (<0 for auto.)
... {varname}       -title
... 1.5             -positioning of stats (L to R: -1 to 1)
... -1.1e21         -reference value for box plot
... '''
>>>
>>> histplt = gs.Program(program='histplt', parfile='histplt.par')
>>>
>>> histplt.run(parstr=histpltpar.format(datafl=datafl.flname,
...                                      varcol=datafl.gscol('Bitumen'),
...                                      varname='Bitumen',
...                                      outfl='histplt_bitumen.ps'))
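The general pattern described for run (write the parameter file, call the executable through subprocess, print the output and raise on failure) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the pygeostat implementation: run_with_parfile is a hypothetical helper, and a Python one-liner stands in for a GSLIB executable so that the sketch runs anywhere.

```python
import os
import subprocess
import sys
import tempfile


def run_with_parfile(command, parstr, parfile):
    """Write parstr to parfile, run command with parfile as its argument.

    Hypothetical helper for illustration; returns the captured output and
    raises RuntimeError if the program exits with a non-zero code.
    """
    with open(parfile, "w") as f:
        f.write(parstr)
    result = subprocess.run(
        command + [parfile],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    if result.returncode != 0:
        # Show what the program printed before raising
        print(result.stdout)
        raise RuntimeError(
            "%s failed with exit code %d" % (command[0], result.returncode)
        )
    return result.stdout


# Stand-in "program": echoes the first line of the parameter file it is given
stand_in = [
    sys.executable,
    "-c",
    "import sys; print(open(sys.argv[1]).readline().strip())",
]
parfile = os.path.join(tempfile.mkdtemp(), "demo.par")
output = run_with_parfile(
    stand_in, "Parameters for DEMO\n*******************\n", parfile
)
print(output)
```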
Write Parameter File
Program.writepar(parstr=None, parfile=None, pardict=None)
Writes out the parameter file without running the program, which can be helpful for checking
Parameters: - parstr (str) – This is the parameter file string initiated with the program.
- parfile (str) – file name or path to save the parameter file to.
- pardict (dict) – Dictionary for the variables in the parameter file.
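The interplay of parstr, parfile and pardict can be sketched as follows. This is a minimal illustration, assuming pardict is applied to the {} placeholders with str.format; write_parfile is a hypothetical stand-in and not the pygeostat method.

```python
import os
import tempfile


def write_parfile(parstr, parfile, pardict=None):
    """Fill the {} placeholders in parstr from pardict and save to parfile.

    Hypothetical helper for illustration; returns the text that was written.
    """
    text = parstr.format(**pardict) if pardict else parstr
    with open(parfile, "w") as f:
        f.write(text)
    return text


template = """Parameters for HISTPLT
**********************
START OF PARAMETERS:
{datafl}    -file with data
{varcol} 0  -columns for variable and weight
"""

parfile = os.path.join(tempfile.mkdtemp(), "histplt.par")
written = write_parfile(template, parfile, {"datafl": "data.dat", "varcol": 4})
print(written)
```

Inspecting the written file before running the program is a quick way to confirm that every placeholder was filled as intended.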
Run Programs in Parallel
pygeostat.programs.program_utils.runparallel(gslibprogram, kwargslist, nprocess=None, mute=False, progressbar=False)
Run a set of gslib program calls in parallel
Parameters: - gslibprogram (Program) – name of GSLIB/CCG program to run
- kwargslist (list of dictionaries) – list of keyword arguments which will be used to call gslibprogram.run(kwarg)
- nprocess (int) – number of threads to spawn. Drawn from Parameters[‘config.nprocess’] if None.
Examples
Setting up the calling parameters. This example is based on the example used in gs.Program():
>>> callpars = []
>>> # For each variable we want to run in parallel, assemble a dictionary of parameters and
... # append to callpars
>>> for variable in ['Bitumen','Fines','Chlorides']:
...     # Establish the parameter file for this variable
...     mypars = {'datafl':datafl.flname,
...               'varcol':datafl.gscol(variable),
...               'varname':variable,
...               'outfl':'histplt_'+variable+'.ps'}
...     # Assemble the arguments for the GSLIB call and add the arguments to the list of calls
...     callpars.append({'parstr':histpltpar.format(**mypars),
...                      'parfile':'histplt_'+variable+'.par',
...                      'testfilename':datafl.flname})
Now run in parallel
>>> histplt = gs.Program(program='histplt', parfile='histplt.par')
>>> gs.runparallel(histplt, callpars)
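The idea of dispatching one call per keyword dictionary can be sketched with a standard-library thread pool. This is an illustration only: run_all and fake_run are hypothetical names, fake_run stands in for Program.run, and how runparallel schedules its work internally is not shown here.

```python
from multiprocessing.pool import ThreadPool


def run_all(run, kwargslist, nprocess=2):
    """Call run(**kwargs) for each dictionary in kwargslist using a thread pool.

    Hypothetical helper for illustration; results keep the order of kwargslist.
    """
    with ThreadPool(processes=nprocess) as pool:
        return pool.map(lambda kwargs: run(**kwargs), kwargslist)


def fake_run(parstr, parfile):
    # Stand-in for a program call: report what would have been run
    return "%s <- %d characters" % (parfile, len(parstr))


callpars = [
    {"parstr": "Parameters for " + variable,
     "parfile": "histplt_" + variable + ".par"}
    for variable in ["Bitumen", "Fines", "Chlorides"]
]
results = run_all(fake_run, callpars)
for line in results:
    print(line)
```

Giving each call its own parfile name, as in the example above, keeps concurrent calls from overwriting one another's parameter files.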
Misc Program Utilities
pygeostat.programs.program_utils.parallel_function(function, arglist=None, kwarglist=None, nprocess=None, returnvals=False, progressbar=False)
Quickly parallelize a function with a set of arguments or keyword arguments. If the function returns something (as opposed to writing values out to files), set returnvals=True to get the dictionary that can be used to collect the results.
Parameters: - function (func) – a callable function DEFINED IN A .py FILE. The function must be imported from a module since it has to be pickled to be parallelized. Defining the function in a Jupyter notebook does not seem to work.
- arglist (list of tuples) – a list of tuple arguments to pass to the function, see examples
- kwarglist (list) – a list of keyword dictionaries to pass to the functions, i.e. [{‘arg1’: value, ‘arg2’: value}, {‘arg1’: value, ‘arg2’: value}, etc]
- nprocess (int) – the number of parallel processes to run. Drawn from Parameters[‘config.nprocess’] if None.
- returnvals (bool) – if the function returns something, collect it in a dictionary. Use the .get() method of each parallel result to retrieve the required data.
Returns: optionally return a dictionary of parallel processing results
Return type: res (dict)
Usage:
For a function that takes a single argument, setup arglist in this way:
>>> arglist = []
>>> arglist.append((arg,))
OR:
>>> arglist = [(arg,) for arg in range(nparallel)]
Alternatively the argument tuple may take several arguments:
>>> arglist = [(arg1, arg2, arg3), (arg1, arg2, arg3), (arg1, arg2, arg3)]
Detailed usage:
>>> arglist = []
>>> for sr in series:
...     arglist.append(('keyout.out', rbfpath + 'keyout%s.out' % sr, griddef,
...                     griddefs[sr], [3]))
>>> gs.parallel_function(rm.changegrid, arglist=arglist)
OR:
>>> kwarglist = []
>>> for sr in series:
...     kwarglist.append({'infl': 'keyout.out',
...                       'outfl': rbfpath + 'keyout%s.out' % sr,
...                       'ingrid': griddef,
...                       'outgrid': griddefs[sr],
...                       'avmethods': [3]
...                       })
>>> gs.parallel_function(rm.changegrid, kwarglist=kwarglist)
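The requirement that the function be importable follows from how pickle handles functions: it stores a reference to the module and name rather than the code itself, so a worker process must be able to import that name. A minimal sketch of checking this up front, where can_pickle is a hypothetical helper:

```python
import math
import pickle


def can_pickle(func):
    """Return True if func can be pickled, False otherwise.

    Hypothetical helper for illustration; not part of pygeostat.
    """
    try:
        pickle.dumps(func)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A function that lives in an importable module is pickled by reference
print(can_pickle(math.sqrt))

# A lambda has no importable name, so pickling it fails
print(can_pickle(lambda x: x * 2))
```

Passing this check is necessary but may not be sufficient: a function defined interactively can pickle in the parent session and still be unavailable to worker processes that cannot import it, which is consistent with the notebook caveat above.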
pygeostat.programs.program_utils.rseed(prng=None)
A random seed generator object, callable, to replace gs.rseed, to generate more unique random seeds
pygeostat.programs.program_utils.rseed_list(nseeds, seed=None)
Returns a list of ACORNI (GSLIB-suitable) random number seeds. An initial seed can be passed, ensuring that the same list of seeds is returned when a script is rerun.
Parameters: - nseeds (int) – Number of seeds to return
- seed (int) – Initialization seed
Returns: List of random number seeds
Return type: seeds (list)
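The reproducibility described above can be sketched with the standard library. This is not the pygeostat algorithm: make_seed_list is a hypothetical helper, and the choice of odd integers below 2**31 is an assumption based on the common GSLIB convention of using a large odd integer as the seed.

```python
import random


def make_seed_list(nseeds, seed=None):
    """Return nseeds odd integers; the same seed always gives the same list.

    Hypothetical helper for illustration; not part of pygeostat.
    """
    prng = random.Random(seed)
    # Setting the lowest bit forces every drawn integer to be odd
    return [prng.randrange(1000, 2**31 - 1) | 1 for _ in range(nseeds)]


seeds_a = make_seed_list(5, seed=69069)
seeds_b = make_seed_list(5, seed=69069)
print(seeds_a)
print(seeds_a == seeds_b)
```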
pygeostat.programs.program_utils.parstr_kwargs(parstr, fmt='pars')
Print a formatted list of kwargs found in the parfile. Tested and working for the {} style string formatting, as in the gamsim_ave parfile in the example below. This is mostly a convenience that saves writing out the kwargs you just entered into the parfile; it can also help if the parfile is defined elsewhere and you cannot remember which kwargs you set up.
Parameters: - parstr (str) – the par string with the {} formatted parameters
- fmt (str) – the output format, permissible arguments are pars or dict
Examples
Get the formatted parfile
>>> parstr = ''' Parameters for GAMSIM_AVE
... *************************
... START OF PARAMETERS:
... {lithfl} -file with lithology information
... {lithcol} {lithcode} - lithology column (0=not used), code
... {datafl} -file with data
... ....
... '''
>>> parstr_kwargs(parstr)
lithfl=, lithcol=, lithcode=, datafl=, ...
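The same kind of listing can be produced with the standard library, which may help clarify what is being extracted. This is a minimal sketch and not the pygeostat implementation: find_kwargs is a hypothetical helper built on string.Formatter().parse, which yields the field name of each {} placeholder.

```python
import string


def find_kwargs(parstr):
    """Return the {} placeholder names in parstr, in order of first appearance.

    Hypothetical helper for illustration; not part of pygeostat.
    """
    names = []
    for _literal, field_name, _spec, _conversion in string.Formatter().parse(parstr):
        # field_name is None for trailing literal text with no placeholder
        if field_name and field_name not in names:
            names.append(field_name)
    return names


parstr = """Parameters for GAMSIM_AVE
*************************
START OF PARAMETERS:
{lithfl}                -file with lithology information
{lithcol} {lithcode}    -lithology column (0=not used), code
{datafl}                -file with data
"""

names = find_kwargs(parstr)
print(", ".join(name + "=" for name in names))
```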
pygeostat.programs.program_utils.dedent_parstr(indented_parstr)
Remove leading indents on each line from a parameter file. This is automatically called by Program.run() so that parfiles may be tabbed to permit better structuring of python code.
Examples
An un-tabbed parstr:
>>> parstr = ''' Parameters for GAMSIM_AVE
... *************************
... START OF PARAMETERS:
... {lithfl} -file with lithology information
... {lithcol} {lithcode} - lithology column (0=not used), code
... {datafl} -file with data
... ....
... '''
A tabbed parstr:
>>> parstr = ''' Parameters for GAMSIM_AVE
...     *************************
...     START OF PARAMETERS:
...     {lithfl} -file with lithology information
...     {lithcol} {lithcode} - lithology column (0=not used), code
...     {datafl} -file with data
...     ....
...     '''
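The effect of removing leading indents can be sketched in a few lines. This is a minimal illustration, assuming that every line is simply stripped of its leading whitespace; strip_indents is a hypothetical helper, and dedent_parstr may handle edge cases differently.

```python
def strip_indents(indented_parstr):
    """Remove leading whitespace from every line of a parameter string.

    Hypothetical helper for illustration; not part of pygeostat.
    """
    return "\n".join(line.lstrip() for line in indented_parstr.splitlines())


tabbed = """
    Parameters for GAMSIM_AVE
    *************************
    START OF PARAMETERS:
    {lithfl}    -file with lithology information
"""

flat = strip_indents(tabbed)
print(flat)
```

Only the leading whitespace is touched, so the spacing between a value and its trailing comment is preserved.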