- abstract_metadata(spec_json, meta_path)
- Extract the metadata information from a self-contained umbrella spec into a metadata database.
Args:
spec_json: a dict including the contents from a json file
meta_path: the path of the metadata database.
Returns:
If the umbrella spec is not complete, exit directly.
Otherwise, return None.
- add2db(item, source_dict, target_dict)
- Add the metadata information (source, format, checksum, size) about item from source_dict (umbrella specification) to target_dict (metadata database).
The item can be identified through two mechanisms: its checksum attribute, or one source location, which is used when a checksum is not applicable for this item.
If the item is already in the metadata database, do nothing; otherwise, add it, together with its metadata, to the metadata database.
Args:
item: the name of a dependency
source_dict: fragment of an Umbrella specification
target_dict: fragment of an Umbrella metadata database
Returns:
None
- add2spec(item, source_dict, target_dict)
- Extract the metadata information (source, format, checksum, size) from source_dict (metadata database) and add this information to target_dict (umbrella spec).
For any piece of metadata information, if it already exists in target_dict, do nothing; otherwise, add it to the umbrella spec.
Args:
item: the name of a dependency
source_dict: fragment of an Umbrella metadata database
target_dict: fragment of an Umbrella specification
Returns:
None
- attr_check(item, attr, check_len=0)
- Check and obtain the attr of an item.
Args:
item: an item from the metadata database
attr: an attribute
check_len: if set to 1, also check whether the length of the attr is > 0; if set to 0, ignore the length checking.
Returns:
If the attribute check is successful, directly return the attribute.
Otherwise, directly exit.
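The contract of attr_check can be illustrated with a minimal sketch. This is not the Umbrella source: it assumes that item is a plain dict and that "exit" means calling sys.exit with a message; the message texts are made up for illustration.

```python
import sys


def attr_check(item, attr, check_len=0):
    """Return item[attr], or exit if the attribute is missing or (optionally) empty."""
    # Assumption: item is a dict taken from the metadata database.
    if attr not in item:
        sys.exit("the %s attribute is missing" % attr)
    value = item[attr]
    # Only enforce a non-empty value when the caller asks for it.
    if check_len == 1 and len(value) == 0:
        sys.exit("the %s attribute is empty" % attr)
    return value
```

With check_len left at 0, an empty list or string is returned unchanged; with check_len=1 the same value ends the program.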
- cctools_download(sandbox_dir, meta_json, hardware_platform, linux_distro, action)
- Download cctools
Args:
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
meta_json: the json object including all the metadata of dependencies.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
linux_distro: the linux distro. For Example: redhat6, centos6.
action: the action on the downloaded dependency. Options: none, unpack. "none" leaves the downloaded dependency as it is. "unpack" uncompresses the dependency.
Returns:
the path of the downloaded cctools in the umbrella local cache. For example: /tmp/umbrella_test/cache/d19376d92daa129ff736f75247b79ec8/cctools-4.9.0-redhat6-x86_64
- check_cvmfs_repo(repo_name)
- Check whether a cvmfs repo is installed on the host or not
Args:
repo_name: a cvmfs repo name. For example: "/cvmfs/cms.cern.ch".
Returns:
If the cvmfs repo is installed, returns the string including the mountpoint of cvmfs cms repo. For example: "/cvmfs/cms.cern.ch".
Otherwise, return an empty string.
- chroot_mount_bind(dir_dict, file_dict, sandbox_dir, need_separate_rootfs, hardware_platform, distro_name, distro_version)
- Create each target mountpoint under the cached os image directory through `mount --bind`.
Args:
dir_dict: a dict including all the directory mountpoints needed to be created inside the OS image.
file_dict: a dict including all the file mountpoints needed to be created inside the OS image.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
need_separate_rootfs: whether a separate rootfs is needed to execute the user's command.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
distro_name: the name of the required OS (e.g., redhat).
distro_version: the version of the required OS (e.g., 6.5).
Returns:
If no error happens, returns None.
Otherwise, directly exit.
- chroot_post_process(dir_dict, file_dict, sandbox_dir, need_separate_rootfs, hardware_platform, distro_name, distro_version)
- Remove all the created target mountpoints within the cached os image directory.
It is not necessary to change the mode of the output dir, because only the root user can use the chroot method.
Args:
dir_dict: a dict including all the directory mountpoints needed to be created inside the OS image.
file_dict: a dict including all the file mountpoints needed to be created inside the OS image.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
need_separate_rootfs: whether a separate rootfs is needed to execute the user's command.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
distro_name: the name of the required OS (e.g., redhat).
distro_version: the version of the required OS (e.g., 6.5).
Returns:
If no error happens, returns None.
Otherwise, directly exit.
- chrootize_user_cmd(user_cmd, cwd_setting)
- Modify the user's command when the sandbox_mode is chroot. This check should be done after `parrotize_user_cmd`.
The cases when this function should be called: sandbox_mode == chroot
Args:
user_cmd: the user's command.
cwd_setting: the current working directory for the execution of the user's command.
Returns:
the modified version of the user's cmd.
- cleanup(filelist, dirlist)
- Cleanup the temporary files and dirs created by umbrella
Args:
filelist: a list including file paths
dirlist: a list including dir paths
Returns:
None
- collect_software_bin(host_cctools_path, sw_mount_dict)
- Construct the path environment from the mountpoints of software dependencies.
Each software meta has a bin subdir containing all its executables.
Args:
host_cctools_path: the path of cctools under the umbrella local cache.
sw_mount_dict: a dict only including all the software mounting items.
Returns:
extra_path: the paths which are extracted from sw_mount_dict and host_cctools_path, and needed to be added into PATH.
- compare_versions(v1, v2)
- Compare two versions written in the format X.X.X.
Args:
v1: a version.
v2: a version.
Returns:
0 if v1 == v2; 1 if v1 is newer than v2; -1 if v1 is older than v2.
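A minimal sketch of a comparison that satisfies the return values above. It is an illustration rather than the Umbrella implementation, and it assumes that every component is a non-negative integer and that missing trailing components count as 0.

```python
def compare_versions(v1, v2):
    """Return 0 if v1 == v2, 1 if v1 is newer than v2, -1 if v1 is older."""
    # Assumption: each dot-separated component is a non-negative integer.
    parts1 = [int(p) for p in v1.split(".")]
    parts2 = [int(p) for p in v2.split(".")]
    # Assumption: pad the shorter version with zeros, so "2.6" equals "2.6.0".
    width = max(len(parts1), len(parts2))
    parts1 += [0] * (width - len(parts1))
    parts2 += [0] * (width - len(parts2))
    if parts1 == parts2:
        return 0
    # Lists compare element by element, which gives numeric ordering.
    return 1 if parts1 > parts2 else -1
```

Comparing integers rather than strings matters here: as strings, "2.6.9" would sort after "2.6.18".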
- condor_process(spec_path, spec_json, spec_path_basename, meta_path, sandbox_dir, output_dir, input_list_origin, user_cmd, cwd_setting, condorlog_path, cvmfs_http_proxy)
- Process the specification when condor execution engine is chosen
Args:
spec_path: the absolute path of the specification.
spec_json: the json object including the specification.
spec_path_basename: the file name of the specification.
meta_path: the path of the json file including all the metadata information.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
output_dir: the output directory.
input_list_origin: the list of input file paths.
user_cmd: the user's command.
cwd_setting: the current working directory for the execution of the user's command.
condorlog_path: the path of the umbrella log executed on the remote condor execution node.
cvmfs_http_proxy: HTTP_PROXY environment variable used to access CVMFS by Parrot
Returns:
If no errors happen, return None;
Otherwise, directly exit.
- construct_chroot_mount_dict(sandbox_dir, output_dir, input_dict, need_separate_rootfs, os_image_dir, mount_dict, host_cctools_path)
- Construct directory mount list and file mount list for chroot. chroot requires the target mountpoint must be created within the chroot jail.
Args:
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
output_dir: the output directory.
input_dict: the setting of input files specified by the --inputs option.
need_separate_rootfs: whether a separate rootfs is needed to execute the user's command.
os_image_dir: the path of the OS image inside the umbrella local cache.
mount_dict: a dict including each mounting item in the specification, whose key is the access path used by the user's task; whose value is the actual storage path.
host_cctools_path: the path of cctools under the umbrella local cache.
Returns:
a tuple including the directory mount list and the file mount list
- construct_docker_volume(input_dict, mount_dict)
- Construct the docker volume parameters based on mount_dict.
Args:
input_dict: the setting of input files specified by the --inputs option.
mount_dict: a dict including each mounting item in the specification, whose key is the access path used by the user's task; whose value is the actual storage path.
Returns:
volume_paras: all the `-v` options for the docker command.
- construct_env(sandbox_dir, os_image_dir)
- Read env_list inside an OS image and save all the environment variables into a dictionary.
Args:
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
os_image_dir: the path of the OS image inside the umbrella local cache.
Returns:
env_dict: a dictionary which includes all the environment variables from env_list
- construct_mountfile_cvmfs_cms_siteconf(sandbox_dir, cvmfs_cms_siteconf_mountpoint)
- Create the mountfile if chroot or docker is used to execute a CMS application and the host machine does not have cvmfs installed.
Args:
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
cvmfs_cms_siteconf_mountpoint: a string in the format of '/cvmfs/cms.cern.ch/SITECONF/local <SITEINFO dir in the umbrella local cache>/local'
Returns:
the path of the mountfile.
- construct_mountfile_easy(sandbox_dir, input_dict, mount_dict, cvmfs_cms_siteconf_mountpoint)
- Create the mountfile if parrot is used to create a sandbox for the application and a separate rootfs is not needed.
The order in which items are added matters: items added later are checked first during execution.
Args:
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
mount_dict: all the mount items extracted from the specification file and possible implicit dependencies like cctools.
input_dict: the setting of input files specified by the --inputs option
cvmfs_cms_siteconf_mountpoint: a string in the format of '/cvmfs/cms.cern.ch/SITECONF/local <SITEINFO dir in the umbrella local cache>/local'
Returns:
the path of the mountfile.
- construct_mountfile_full(sandbox_dir, os_image_dir, mount_dict, input_dict, cvmfs_cms_siteconf_mountpoint)
- Create the mountfile if parrot is used to create a sandbox for the application and a separate rootfs is needed.
The order in which items are added matters: items added later are checked first during execution.
Args:
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
os_image_dir: the path of the OS image inside the umbrella local cache.
mount_dict: all the mount items extracted from the specification file and possible implicit dependencies like cctools.
input_dict: the setting of input files specified by the --inputs option
cvmfs_cms_siteconf_mountpoint: a string in the format of '/cvmfs/cms.cern.ch/SITECONF/local <SITEINFO dir in the umbrella local cache>/local'
Returns:
the path of the mountfile.
- create_docker_image(sandbox_dir, hardware_platform, distro_name, distro_version, meta)
- Create a docker image based on the cached os image directory.
Returns:
If the docker image is imported from the tarball successfully, returns None.
Otherwise, directly exit.
- create_fake_mount(os_image_dir, sandbox_dir, mount_list, path)
- For each ancestor dir B of path (including path itself), check whether it exists in the rootfs, whether it exists in the mount_list, and
whether it exists in the fake_mount directory inside the sandbox.
If B is inside the rootfs or the fake_mount dir, do nothing. Otherwise, create a fake directory inside the fake_mount.
Reason: every ancestor dir of a path must exist because the `cd` shell builtin does a stat syscall on each level of
the ancestor dirs of a path. Without a mountpoint for every ancestor dir, `cd` would fail.
Args:
os_image_dir: the path of the OS image inside the umbrella local cache.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
mount_list: a list of mountpoints which are already inside the parrot mountlist file.
path: a dir path.
Returns:
mount_str: a string including the mount items which need to be added to the parrot mount file.
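The ancestor walk described for create_fake_mount can be sketched as follows. This is an assumption-laden illustration, not the Umbrella code: the helper names ancestor_dirs and missing_ancestors are hypothetical, path is assumed to be absolute, and the check against the fake_mount directory is left out for brevity.

```python
import os


def ancestor_dirs(path):
    """List every ancestor of an absolute path, outermost first, including path itself."""
    path = os.path.normpath(path)
    ancestors = []
    while path not in ("/", ""):
        ancestors.append(path)
        path = os.path.dirname(path)
    return list(reversed(ancestors))


def missing_ancestors(os_image_dir, mount_list, path):
    """Return the ancestors that exist neither in the rootfs nor in mount_list."""
    missing = []
    for ancestor in ancestor_dirs(path):
        # The rootfs copy of "/opt" lives at "<os_image_dir>/opt".
        in_rootfs = os.path.exists(os_image_dir.rstrip("/") + ancestor)
        if not in_rootfs and ancestor not in mount_list:
            missing.append(ancestor)
    return missing
```

Each directory returned by missing_ancestors is a candidate for a fake directory, so that a stat on any level of the path succeeds.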
- data_dependency_process(name, id, meta_json, sandbox_dir, action)
- Download a data dependency
Args:
name: the item name in the data section
id: the id attribute of the processed dependency
meta_json: the json object including all the metadata of dependencies.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
action: the action on the downloaded dependency. Options: none, unpack. "none" leaves the downloaded dependency as it is. "unpack" uncompresses the dependency.
Returns:
dest: the path of the downloaded data dependency in the umbrella local cache.
- data_install(data_spec, meta_json, sandbox_dir, mount_dict, env_para_dict)
- Process data section of the specification.
At the beginning of the function, mount_dict only includes items for software and os dependencies. After this function is done, all the items for data dependencies will be added into mount_dict.
Args:
data_spec: the data section of the specification.
meta_json: the json object including all the metadata of dependencies.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
mount_dict: a dict including each mounting item in the specification, whose key is the access path used by the user's task; whose value is the actual storage path.
env_para_dict: the environment variables which need to be set for the execution of the user's command.
Returns:
mount_dict: the modified mount_dict with all the new mountpoints for data dependencies.
env_para_dict: the environment variables which need to be set for the execution of the user's command.
- decide_instance_type(cpu_cores, memory_size, disk_size, instances)
- Compare the required hardware configuration with each instance type and return the first matching instance type; return 'no' if no matching instance type exists.
In the future, instance types could be ranked so that, when multiple matches exist, the closest match is returned.
Args:
cpu_cores: the number of required cpus (e.g., 1).
memory_size: the memory size requirement (e.g., 2GB). Not case sensitive.
disk_size: the disk size requirement (e.g., 2GB). Not case sensitive.
instances: the instances section of the ec2 json file.
Returns:
If there is no matched instance type, return 'no'.
Otherwise, returns the first matched instance type.
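The first-match rule can be sketched as below. The layout of the instances section is an assumption made for this example (a dict mapping each instance type to "cores", "memory" and "disk" entries); the real ec2 json file may be organized differently. The helper size_in_gb is also hypothetical and only accepts sizes written with a GB suffix.

```python
def size_in_gb(size):
    """Parse a size such as '2GB' or '2gb' into a float number of gigabytes."""
    text = size.strip().upper()
    if not text.endswith("GB"):
        raise ValueError("unsupported size: %r" % size)
    return float(text[:-2])


def decide_instance_type(cpu_cores, memory_size, disk_size, instances):
    """Return the first instance type meeting all three requirements, or 'no'."""
    # Assumption: instances maps a type name to its cores, memory and disk.
    for name, spec in instances.items():
        if (int(spec["cores"]) >= int(cpu_cores)
                and size_in_gb(spec["memory"]) >= size_in_gb(memory_size)
                and size_in_gb(spec["disk"]) >= size_in_gb(disk_size)):
            return name
    return "no"
```

Because the first match wins, the order of the instance types decides which one is chosen when several qualify.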
- dependency_check(item)
- Check whether an executable exists or not.
Args:
item: the name of the executable to be found.
Returns:
If the executable can be found through $PATH, return 0;
Otherwise, return -1.
- dependency_download(url, checksum, checksum_tool, dest, format_remote_storage, action)
- Download a dependency from the url and verify its integrity.
Args:
url: the storage location of the dependency.
checksum: the checksum of the dependency.
checksum_tool: the tool used to calculate the checksum, such as md5sum.
dest: the destination of the dependency where the downloaded dependency will be put.
format_remote_storage: the file format of the dependency, such as .tgz.
action: the action on the downloaded dependency. Options: none, unpack. "none" leaves the downloaded dependency as it is. "unpack" uncompresses the dependency.
Returns:
If the url is a broken link or the integrity of the downloaded data is bad, directly exit.
Otherwise, return None.
- dependency_process(name, id, action, meta_json, sandbox_dir, sandbox_mode, user_cmd, cwd_setting, hardware_platform, host_linux_distro, linux_distro, cvmfs_http_proxy)
- Process each explicit and implicit dependency.
Args:
name: the item name in the software section
id: the id attribute of the processed dependency
action: the action on the downloaded dependency. Options: none, unpack. "none" leaves the downloaded dependency as it is. "unpack" uncompresses the dependency.
os_id: the id attribute of the required OS.
meta_json: the json object including all the metadata of dependencies.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
sandbox_mode: the execution engine.
user_cmd: the user's command.
cwd_setting: the current working directory for the execution of the user's command.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
host_linux_distro: the linux distro of the host machine. For Example: redhat6, centos6.
linux_distro: the linux distro of the required OS. For Example: redhat6, centos6.
cvmfs_http_proxy: HTTP_PROXY environment variable used to access CVMFS by Parrot
Returns:
is_cms_cvmfs_app: whether this is a cms app which will be delivered by cvmfs while the local machine has no cvmfs installed. 1 means this is a cms app. 0 means this is not.
cvmfs_cms_siteconf_mountpoint: a string in the format of '/cvmfs/cms.cern.ch/SITECONF/local <SITEINFO dir in the umbrella local cache>/local'
mount_value: the actual storage path of one dependency.
local_cvmfs: the mountpoint of the local-installed cvmfs (e.g., /cvmfs).
- dir_create(filepath)
- Check the validity and existence of a file path. Create the directory for it if necessary. If the file already exists, exit directly.
Args:
filepath: a file path
Returns:
Exit directly if any error happens.
Otherwise, returns None.
- ec2_process(spec_path, spec_json, meta_path, meta_json, ec2_path, ec2_json, ssh_key, ec2_key_pair, ec2_security_group, sandbox_dir, output_dir, sandbox_mode, input_list, input_list_origin, env_options, user_cmd, cwd_setting, ec2log_path, cvmfs_http_proxy)
- Args:
spec_path: the path of the specification.
spec_json: the json object including the specification.
meta_path: the path of the json file including all the metadata information.
meta_json: the json object including all the metadata of dependencies.
ec2_path: the path of the json file including all information about the ec2 AMIs and instance types.
ec2_json: the json object corresponding to ec2_path.
ssh_key: the name of the private key file to use when connecting to an instance.
ec2_key_pair: the path of the key-pair to use when launching an instance.
ec2_security_group: the security group within which the EC2 instance should be run.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
output_dir: the output directory.
sandbox_mode: the execution engine.
input_list: a list including all the absolute path of the input files on the local machine.
input_list_origin: the list of input file paths.
env_options: the original `--env` option.
user_cmd: the user's command.
cwd_setting: the current working directory for the execution of the user's command.
ec2log_path: the path of the umbrella log executed on the remote EC2 execution node.
cvmfs_http_proxy: HTTP_PROXY environment variable used to access CVMFS by Parrot
Returns:
If no errors happen, return None;
Otherwise, directly exit.
- env_check(sandbox_dir, sandbox_mode, hardware_platform, cpu_cores, memory_size, disk_size, kernel_name, kernel_version)
- Check the matching degree between the specification requirement and the host machine.
Currently check the following item: sandbox_mode, hardware platform, kernel, OS, disk, memory, cpu cores.
Other things needed to check: software, and data??
Args:
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
sandbox_mode: the execution engine.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
cpu_cores: the number of required cpus (e.g., 1).
memory_size: the memory size requirement (e.g., 2GB). Not case sensitive.
disk_size: the disk size requirement (e.g., 2GB). Not case sensitive.
kernel_name: the name of the required OS kernel (e.g., linux). Not case sensitive.
kernel_version: the version of the required kernel (e.g., 2.6.18).
Returns:
host_linux_distro: the linux distro of the host machine. For Example: redhat6, centos6.
- env_parameter_init(hardware_spec, kernel_spec, os_spec)
- Set the environment parameters according to the specification file.
Args:
hardware_spec: the hardware section in the specification for the user's task.
kernel_spec: the kernel section in the specification for the user's task.
os_spec: the os section in the specification for the user's task.
Returns:
a tuple including the requirements for hardware, kernel and os.
- func_call(cmd)
- Execute a command and return the return code, stdout, stderr.
Args:
cmd: the command needs to execute using the subprocess module.
Returns:
a tuple including the return code, stdout, stderr.
- func_call_withenv(cmd, env_dict)
- Execute a command with a special setting of the environment variables and return the return code, stdout, stderr.
Args:
cmd: the command needs to execute using the subprocess module.
env_dict: the environment setting.
Returns:
a tuple including the return code, stdout, stderr.
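Both func_call and func_call_withenv return a (return code, stdout, stderr) tuple. A minimal sketch with the subprocess module is shown here; it assumes that cmd is a single shell command string and that env_dict replaces the inherited environment entirely rather than being merged into it. Neither assumption is confirmed by the docstrings.

```python
import subprocess


def func_call(cmd):
    """Run cmd through the shell and return (return code, stdout, stderr)."""
    # Assumption: cmd is one shell command string, not an argument list.
    p = subprocess.Popen(cmd, shell=True,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = p.communicate()
    return (p.returncode, stdout, stderr)


def func_call_withenv(cmd, env_dict):
    """Same as func_call, but run cmd with env_dict as its environment."""
    # Assumption: env_dict is the complete environment of the child process.
    p = subprocess.Popen(cmd, shell=True, env=env_dict,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = p.communicate()
    return (p.returncode, stdout, stderr)
```

On Python 3 the captured stdout and stderr are bytes objects; callers that need text have to decode them.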
- get_instance_id(image_id, instance_type, ec2_key_pair, ec2_security_group)
- Start one VM instance through Amazon EC2 command line interface and return the instance id.
Args:
image_id: the Amazon Image Identifier.
instance_type: the Amazon EC2 instance type used for the task.
ec2_key_pair: the path of the key-pair to use when launching an instance.
ec2_security_group: the security group within which the EC2 instance should be run.
Returns:
If no error happens, returns the id of the started instance.
Otherwise, directly exit.
- get_linker_path(hardware_platform, os_image_dir)
- Return the path of ld-linux.so within the downloaded os image dependency
Args:
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
os_image_dir: the path of the OS image inside the umbrella local cache.
Returns:
If the dynamic linker is found within the OS image, return its fullpath.
Otherwise, returns None.
- get_public_dns(instance_id)
- Get the public dns of one VM instance from Amazon EC2.
`ec2-run-instances` cannot directly return the public dns of the instance, so this function checks the result of `ec2-describe-instances` to obtain the public dns of the instance.
Args:
instance_id: the id of the VM instance.
Returns:
If no error happens, returns the public dns of the instance.
Otherwise, directly exit.
- git_dependency_download(repo_url, dest, git_branch, git_commit)
- Prepare a dependency from a git repository.
First check whether dest exists or not: if dest exists, then checkout to git_branch and git_commit;
otherwise, git clone url, and then checkout to git_branch and git_commit.
Args:
repo_url: the url of the remote git repository
dest: the local directory where the git repository will be cloned into
git_branch: the branch name of the git repository
git_commit: the commit id of the repository
Returns:
dest: the local directory where the git repository is
- git_dependency_parser(item, repo_url, sandbox_dir)
- Parse a git dependency
Args:
item: an item from the metadata database
repo_url: the url of the remote git repository
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
Returns:
dest: the path of the downloaded data dependency in the umbrella local cache.
- has_docker_image(hardware_platform, distro_name, distro_version)
- Check whether the required docker image exists on the local machine or not.
Args:
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
distro_name: the name of the required OS (e.g., redhat).
distro_version: the version of the required OS (e.g., 6.5).
Returns:
If the required docker image exists on the local machine, returns 'yes'.
Otherwise, returns 'no'.
- in_local_group()
- Judge whether the current user's group exists in /etc/group.
Returns:
If the current user's group exists in /etc/group, returns 'yes'.
Otherwise, returns 'no'.
- in_local_passwd()
- Judge whether the current user exists in /etc/passwd.
Returns:
If the current user is inside /etc/passwd, returns 'yes'.
Otherwise, returns 'no'.
- is_dir(path)
- Judge whether a path is directory or not.
If the path is a dir, directly return. Otherwise, exit directly.
Args:
path: a path
Returns:
None
- json2file(filepath, json_item)
- Write a json object into a file
Args:
filepath: a file path
json_item: a dict representing a json object
Returns:
None
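Writing a dict to a file as json takes only a few lines with the standard json module. The sketch below is illustrative; the indentation width and the sorted keys are choices made for this example, not documented behavior of json2file.

```python
import json


def json2file(filepath, json_item):
    """Serialize json_item (a dict) into filepath as human-readable json."""
    with open(filepath, "w") as f:
        # Assumption: pretty-print with sorted keys for stable, readable output.
        json.dump(json_item, f, indent=4, sort_keys=True)
        f.write("\n")
```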
- main()
- md5_cal(filename, block_size=1048576)
- Calculate the md5sum of a file
Args:
filename: the name of the file
block_size: the size of each block
Returns:
If the calculation fails for any reason, directly exit.
Otherwise, return the md5 value of the content of the file
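The block_size parameter suggests that the file is read in chunks rather than loaded into memory at once. A minimal sketch of that approach with hashlib follows; the error message and the use of sys.exit are assumptions.

```python
import hashlib
import sys


def md5_cal(filename, block_size=1048576):
    """Return the hexadecimal md5 digest of a file, reading block_size bytes at a time."""
    digest = hashlib.md5()
    try:
        with open(filename, "rb") as f:
            # iter() with a sentinel stops when read() returns an empty bytes object.
            for block in iter(lambda: f.read(block_size), b""):
                digest.update(block)
    except (IOError, OSError) as error:
        sys.exit("failed to calculate the md5sum of %s: %s" % (filename, error))
    return digest.hexdigest()
```

Reading in blocks keeps memory use bounded by block_size, which matters for large dependencies such as OS images.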
- meta_search(meta_json, name, id=None)
- Search for the metadata information of a dependency in meta_json.
First find all the items with the required name in meta_json.
Then find the one whose id satisfies the requirement.
If no id parameter is provided, the first matched item is returned.
Args:
meta_json: the json object including all the metadata of dependencies.
name: the name of the dependency.
id: the id attribute of the dependency. Defaults to None.
Returns:
If one item is found in meta_json, return the item, which is a dictionary.
If no item in meta_json satisfies the requirement, directly exit.
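The two-step lookup can be sketched as follows. The layout of meta_json is an assumption for this example: a dict that maps each dependency name to a dict of items keyed by id. The exit messages are also made up.

```python
import sys


def meta_search(meta_json, name, id=None):
    """Return the item called name (and, if given, with that id) or exit."""
    # Assumption: meta_json looks like {name: {id: item_dict, ...}, ...}.
    if name not in meta_json:
        sys.exit("no metadata found for %s" % name)
    candidates = meta_json[name]
    if id is None:
        if not candidates:
            sys.exit("no metadata found for %s" % name)
        # Without an id, fall back to the first item stored under this name.
        return next(iter(candidates.values()))
    if id not in candidates:
        sys.exit("no metadata found for %s with id %s" % (name, id))
    return candidates[id]
```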
- obtain_path(os_image_dir, sw_mount_dict)
- Get the path environment variable from envfile and add the mountpoints of software dependencies into it
the envfile here is named env_list under the OS image.
Args:
os_image_dir: the path of the OS image inside the umbrella local cache.
sw_mount_dict: a dict only including all the software mounting items.
Returns:
path_env: the new value for PATH.
- parrotize_user_cmd(user_cmd, sandbox_dir, cwd_setting, linux_distro, hardware_platform, meta_json, cvmfs_http_proxy)
- Modify the user's command into `parrot_run + the user's command`.
The cases when this function should be called: (1) sandbox_mode == parrot; (2) sandbox_mode != parrot and cvmfs is needed to deliver some dependencies not installed on the execution node.
Args:
user_cmd: the user's command.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
cwd_setting: the current working directory for the execution of the user's command.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
linux_distro: the linux distro. For Example: redhat6, centos6.
meta_json: the json object including all the metadata of dependencies.
cvmfs_http_proxy: HTTP_PROXY environment variable used to access CVMFS by Parrot
Returns:
the modified version of the user's cmd.
- prune_attr(dict_item, attr_list)
- Remove certain attributes from a dict.
If a specific attribute does not exist, pass.
Args:
dict_item: a dict
attr_list: a list of attributes which will be removed from the dict.
Returns:
None
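One way to get this in-place, tolerant removal is dict.pop with a default, which does nothing for a missing key. A minimal sketch, not necessarily how Umbrella does it:

```python
def prune_attr(dict_item, attr_list):
    """Remove each attribute in attr_list from dict_item in place."""
    for attr in attr_list:
        # pop() with a default silently skips attributes that are absent.
        dict_item.pop(attr, None)
```

The function modifies dict_item directly and returns None, so callers keep using the dict they passed in.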
- prune_spec(json_object)
- Remove the metadata information from a json file (which represents an umbrella specification).
Note: the original json file will not be changed by this function.
Args:
json_object: a json file representing an umbrella specification
Returns:
temp_json: a new json file without metadata information
- remove_trailing_slashes(path)
- Remove the trailing slashes of a string
Args:
path: a path, which can be any string.
Returns:
path: the new path without any trailing slashes.
- separatize_spec(spec_json, meta_json, target_type)
- Given an umbrella specification and an umbrella metadata database, generate a self-contained umbrella specification or a metadata database only including the information necessary for the umbrella spec.
If the target_type is spec, then generate a self-contained umbrella specification.
If the target_type is db, then generate a metadata database only including the information necessary for the umbrella spec.
Args:
spec_json: the json object including the specification.
meta_json: the json object including all the metadata of dependencies.
target_type: the type of the target json file, which can be an umbrella spec or an umbrella metadata db.
Returns:
metadata: a json object
- set_cvmfs_cms_siteconf(name, action, meta_json, sandbox_dir)
- Download cvmfs SITEINFO and set its mountpoint.
Args:
name: the name of the cvmfs SITEINFO meta in meta_json.
action: the action on the downloaded dependency. Options: none, unpack. "none" leaves the downloaded dependency as it is. "unpack" uncompresses the dependency.
meta_json: the json object including all the metadata of dependencies.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
Returns:
cvmfs_cms_siteconf_mountpoint: a string in the format of '/cvmfs/cms.cern.ch/SITECONF/local <SITEINFO dir in the umbrella local cache>/local'
- software_install(env_para_dict, os_id, software_spec, meta_json, sandbox_dir, sandbox_mode, user_cmd, cwd_setting, hardware_platform, host_linux_distro, linux_distro, distro_name, distro_version, need_separate_rootfs, cvmfs_http_proxy)
- Install each software dependency specified in the software section of the specification.
If the application is a CMS app and the execution node does not have cvmfs installed, change `user_cmd` to `parrot_run ... user_cmd` and cvmfs_cms_siteconf_mountpoint.
Args:
env_para_dict: the environment variables which need to be set for the execution of the user's command.
os_id: the id attribute of the required OS.
software_spec: the software section of the specification
meta_json: the json object including all the metadata of dependencies.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
sandbox_mode: the execution engine.
user_cmd: the user's command.
cwd_setting: the current working directory for the execution of the user's command.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
host_linux_distro: the linux distro of the host machine. For Example: redhat6, centos6.
linux_distro: the linux distro of the required OS. For Example: redhat6, centos6.
distro_name: the name of the required OS (e.g., redhat).
distro_version: the version of the required OS (e.g., 6.5).
need_separate_rootfs: whether a separate rootfs is needed to execute the user's command.
cvmfs_http_proxy: HTTP_PROXY environment variable used to access CVMFS by Parrot
Returns:
host_cctools_path: the path of cctools under the umbrella local cache.
cvmfs_cms_siteconf_mountpoint: a string in the format of '/cvmfs/cms.cern.ch/SITECONF/local <SITEINFO dir in the umbrella local cache>/local'
mount_dict: a dict including each mounting item in the specification, whose key is the access path used by the user's task; whose value is the actual storage path.
env_para_dict: the environment variables which need to be set for the execution of the user's command.
- specification_process(spec_json, sandbox_dir, behavior, meta_json, sandbox_mode, output_dir, input_dict, env_para_dict, user_cmd, cwd_setting, cvmfs_http_proxy)
- Create the execution environment specified in the specification file and run the task on it.
Args:
spec_json: the json object including the specification.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
behavior: the umbrella behavior, such as `run`.
meta_json: the json object including all the metadata of dependencies.
sandbox_mode: the execution engine.
output_dir: the output directory.
input_dict: the setting of input files specified by the --inputs option.
env_para_dict: the environment variables which need to be set for the execution of the user's command.
user_cmd: the user's command.
cwd_setting: the current working directory for the execution of the user's command.
cvmfs_http_proxy: HTTP_PROXY environment variable used to access CVMFS by Parrot
Returns:
None.
- subprocess_error(cmd, rc, stdout, stderr)
- Print the command, return code, stdout, and stderr; and then directly exit.
Args:
cmd: the executed command.
rc: the return code.
stdout: the standard output of the command.
stderr: standard error of the command.
Returns:
directly exit the program.
- terminate_instance(instance_id)
- Terminate an instance.
Args:
instance_id: the id of the VM instance.
Returns:
None.
- transfer_env_para_docker(env_para_dict)
- Transfer the env_para_dict into the docker `-e` options.
Args:
env_para_dict: the environment variables which need to be set for the execution of the user's command.
Returns:
env_options: the docker `-e` options constructed from env_para_dict.
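Turning a dict of environment variables into docker `-e` options could look like the sketch below. The exact string format produced by Umbrella is not documented here, so the space-separated layout and the use of shlex.quote (Python 3) for shell safety are assumptions.

```python
import shlex


def transfer_env_para_docker(env_para_dict):
    """Build a string of docker `-e NAME=value` options from a dict."""
    options = []
    for name, value in env_para_dict.items():
        # Quote the whole assignment so that spaces in values survive the shell.
        options.append("-e " + shlex.quote("%s=%s" % (name, value)))
    return " ".join(options)
```

The resulting string is meant to be spliced into a `docker run` command line that is later executed through a shell.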
- url_download(url, dest)
- Download url into dest
Args:
url: the url needed to be downloaded.
dest: the path where the content from the url should be put.
Returns:
If the url is downloaded successfully, return None;
Otherwise, directly exit.
- validate_meta(meta_json)
- Validate a metadata db.
The current standard for a valid metadata db is: for each item, the "source" attribute must exist and must not be empty.
Args:
meta_json: a dict object representing a metadata db.
Returns:
If error happens, return directly with the error info.
Otherwise, None.
- validate_spec(spec_json, meta_json=None)
- Validate a spec_json.
Args:
spec_json: a dict object representing a specification.
meta_json: a dict object representing a metadata db.
Returns:
If error happens, return directly with the error info.
Otherwise, None.
- verify_kernel(host_kernel_name, host_kernel_version, kernel_name, kernel_version)
- Check whether the kernel version of the host machine matches the requirement.
The kernel_version format supported for now includes: >=2.6.18; [2.6.18, 2.6.32].
Args:
host_kernel_name: the name of the OS kernel of the host machine.
host_kernel_version: the version of the kernel of the host machine.
kernel_name: the name of the required OS kernel (e.g., linux). Not case sensitive.
kernel_version: the version of the required kernel (e.g., 2.6.18).
Returns:
If the kernel version of the host machine matches the requirement, return None.
If the kernel version of the host machine does not match the requirement, directly exit.
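The two supported requirement formats, a lower bound such as `>=2.6.18` and a range such as `[2.6.18, 2.6.32]`, can be checked as sketched below. The helper names version_tuple and kernel_version_matches are hypothetical, the range is assumed to be inclusive on both ends, and host versions with a suffix (for example a distribution release tag) are assumed to be compared on their leading numeric part only. Unlike verify_kernel, this sketch returns a boolean instead of exiting.

```python
import re


def version_tuple(version):
    """Turn the leading dotted-integer part of a version into a tuple of ints."""
    match = re.match(r"\d+(\.\d+)*", version.strip())
    if match is None:
        raise ValueError("unsupported version: %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def kernel_version_matches(host_version, requirement):
    """Check host_version against '>=X.X.X' or '[X.X.X, Y.Y.Y]'."""
    host = version_tuple(host_version)
    req = requirement.strip()
    if req.startswith(">="):
        return host >= version_tuple(req[2:])
    if req.startswith("[") and req.endswith("]"):
        low, high = req[1:-1].split(",")
        # Assumption: both ends of the range are inclusive.
        return version_tuple(low) <= host <= version_tuple(high)
    raise ValueError("unsupported kernel requirement: %r" % requirement)
```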
- which_exec(name)
- An implementation of the shell `which` command.
Args:
name: the name of the executable to be found.
Returns:
If the executable is found, returns its fullpath.
If PATH is not set, directly exit.
Otherwise, returns None.
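The three outcomes listed above can be sketched with a plain walk over the directories in PATH. This is an illustration under the assumption that "found" means a regular file with execute permission; the exit message is made up.

```python
import os
import sys


def which_exec(name):
    """Return the full path of the first executable called name on PATH, else None."""
    if "PATH" not in os.environ:
        sys.exit("the PATH environment variable is not set")
    for directory in os.environ["PATH"].split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Assumption: a match is a regular file the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

A caller such as dependency_check can then map a returned path to 0 and None to -1.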
- workflow_repeat(cwd_setting, sandbox_dir, sandbox_mode, output_dir, input_dict, env_para_dict, user_cmd, hardware_platform, host_linux_distro, distro_name, distro_version, need_separate_rootfs, os_image_dir, host_cctools_path, cvmfs_cms_siteconf_mountpoint, mount_dict, sw_mount_dict, meta_json)
- Run the user's task with the help of the sandbox techniques, which currently include chroot, parrot, and docker.
Args:
cwd_setting: the current working directory for the execution of the user's command.
sandbox_dir: the sandbox dir for temporary files like Parrot mountlist file.
sandbox_mode: the execution engine.
output_dir: the output directory.
input_dict: the setting of input files specified by the --inputs option.
env_para_dict: the environment variables which need to be set for the execution of the user's command.
user_cmd: the user's command.
hardware_platform: the architecture of the required hardware platform (e.g., x86_64).
distro_name: the name of the required OS (e.g., redhat).
distro_version: the version of the required OS (e.g., 6.5).
need_separate_rootfs: whether a separate rootfs is needed to execute the user's command.
os_image_dir: the path of the OS image inside the umbrella local cache.
host_cctools_path: the path of cctools under the umbrella local cache.
cvmfs_cms_siteconf_mountpoint: a string in the format of '/cvmfs/cms.cern.ch/SITECONF/local <SITEINFO dir in the umbrella local cache>/local'
mount_dict: a dict including each mounting item in the specification, whose key is the access path used by the user's task; whose value is the actual storage path.
sw_mount_dict: a dict only including all the software mounting items.
meta_json: the json object including all the metadata of dependencies.
Returns:
If no error happens, returns None.
Otherwise, directly exit.