Class PegasusLite

  • All Implemented Interfaces:
    GridStart

    public class PegasusLite
    extends java.lang.Object
    implements GridStart
    This class launches all jobs using Pegasus Lite, a shell-script-based wrapper. The Pegasus Lite shell script for a compute job contains the commands to
     1) create directory on worker node
     2) fetch input data files
     3) execute the job
     4) transfer the output data files
     5) cleanup the directory
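
    The five steps above can be pictured as a generated shell script. The following is a minimal sketch only, not the script that PegasusLite actually emits. The class name PegasusLiteStepsSketch, the method buildScript, the staging directory argument, and the generic shell commands (mktemp, cp, rm) are all assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the five documented steps; not part of the Pegasus API. */
class PegasusLiteStepsSketch {

    /**
     * Builds a shell script that mirrors the five documented steps.
     * Every command is a generic placeholder, not real PegasusLite output.
     */
    static String buildScript(List<String> inputs, String jobCommand,
                              List<String> outputs, String stagingDir) {
        StringBuilder sb = new StringBuilder();
        sb.append("#!/bin/sh\n");
        sb.append("set -e\n");
        // 1) create directory on worker node
        sb.append("work_dir=$(mktemp -d)\n");
        sb.append("cd \"$work_dir\"\n");
        // 2) fetch input data files (plain cp stands in for a real transfer tool)
        for (String in : inputs) {
            sb.append("cp \"").append(stagingDir).append('/').append(in).append("\" .\n");
        }
        // 3) execute the job
        sb.append(jobCommand).append('\n');
        // 4) transfer the output data files
        for (String out : outputs) {
            sb.append("cp \"").append(out).append("\" \"").append(stagingDir).append("/\"\n");
        }
        // 5) cleanup the directory
        sb.append("cd /\n");
        sb.append("rm -rf \"$work_dir\"\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical file names and job command.
        System.out.print(buildScript(Arrays.asList("f.a"), "./analyze f.a f.b",
                Arrays.asList("f.b"), "/staging"));
    }
}
```

    Keeping the steps in a fixed order inside one script is what lets a single wrapper invocation own the whole lifecycle of the job directory on the worker node.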
     
    To disable the staging of the SLS files via the first-level staging jobs, set the following property to false:
     pegasus.transfer.stage.sls.file     false
     
    To enable this implementation at runtime, set the following property:
     pegasus.gridstart PegasusLite
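
    As a sketch of how the two settings named above might be collected programmatically, the following uses only java.util.Properties. The property keys and values come from this page; the class name PegasusLitePropertiesSketch and the idea of serializing them in properties-file form are assumptions for illustration, not a documented Pegasus workflow.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

/** Hypothetical helper; not part of the Pegasus API. */
class PegasusLitePropertiesSketch {

    /** Returns the two properties described in the class documentation. */
    static Properties pegasusLiteProperties() {
        Properties p = new Properties();
        // Selects this GridStart implementation at runtime.
        p.setProperty("pegasus.gridstart", "PegasusLite");
        // Disables staging of the SLS files via the first level staging jobs.
        p.setProperty("pegasus.transfer.stage.sls.file", "false");
        return p;
    }

    public static void main(String[] args) throws IOException {
        // Print the settings in java.util.Properties file format.
        StringWriter out = new StringWriter();
        pegasusLiteProperties().store(out, "PegasusLite settings (illustrative)");
        System.out.print(out);
    }
}
```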
     
    Version:
    $Revision$
    Author:
    Karan Vahi
    • Constructor Summary

      Constructors 
      Constructor Description
      PegasusLite()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      private void associateCredentials​(Job job, java.util.Collection<FileTransfer> files)
      Associates credentials with the job corresponding to the files that are being transferred.
      boolean canSetXBit()
      Indicates whether the enabling mechanism can set the X bit on the executable on the remote grid site, in addition to launching it on the remote grid site.
      private void complainForHeadNodeFileServer​(java.lang.String jobname, java.lang.String site)
      Complains about a missing head node file server on a site for a job.
      private void construct​(Job job, java.lang.String key, java.lang.String value)
      Constructs a condor variable in the condor profile namespace associated with the job.
      protected java.lang.StringBuffer convertToTransferInputFormat​(java.util.Collection<FileTransfer> files)
      Converts the collection of files into an input format suitable for the transfer executable.
      java.lang.String defaultPOSTScript()
      Returns the SHORT_NAME for the POSTScript implementation that is used as the default with this GridStart implementation.
      boolean enable​(AggregatedJob job, boolean isGlobusJob)
      Enables a job to run on the grid.
      boolean enable​(Job job, boolean isGlobusJob)
      Enables a job to run on the grid by launching it directly.
      private void enableForWorkerNodeExecution​(Job job, boolean isGlobusJob)
      Enables jobs for worker node execution.
      java.lang.String generateListofFilenamesFile​(java.util.Set files, java.lang.String basename)
      Writes out the list of filenames file for the job.
      private java.lang.String getDirectoryKey​(Job job)
      Returns the directory that is associated with the job to specify the directory in which the job needs to run
      protected java.lang.String getPathToChmodExecutable​(java.lang.String site)
      Returns the path to the chmod executable for a particular execution site by looking up the transformation executable.
      protected java.lang.String getSubmitHostPathToPegasusLiteCommon()
      Determines the path to common shell functions file that Pegasus Lite wrapped jobs use.
      java.lang.String getVDSKeyValue()
      Returns the value of the vds profile with key as Pegasus.GRIDSTART_KEY, that would result in the loading of this particular implementation.
      java.lang.String getWorkerNodeDirectory​(Job job)
      Returns the directory in which the job executes on the worker node.
      void initialize​(PegasusBag bag, ADag dag)
      Initializes the GridStart implementation.
      private boolean removeDirectoryKey​(Job job)
      Returns a boolean indicating whether to remove remote directory information or not from the job.
      protected java.lang.String retrieveLocationForWorkerPackageFromTC​(java.lang.String site)
      Retrieves the location for the pegasus worker package from the TC for a site
      protected boolean setXBitOnFile​(java.lang.String file)
      Sets the xbit on the file.
      java.lang.String shortDescribe()
      Returns a short textual description in the form of the name of the class.
      protected java.lang.StringBuffer slurpInFile​(java.lang.String directory, java.lang.String file)
      Convenience method to slurp in contents of a file into memory.
      void useFullPathToGridStarts​(boolean fullPath)
      Setter method to control whether a full path to Gridstart should be returned while wrapping a job or not.
      protected java.io.File wrapJobWithPegasusLite​(Job job, boolean isGlobusJob)
      Generates a seqexec input file for the job.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • mDAG

        private ADag mDAG
      • CLASSNAME

        public static final java.lang.String CLASSNAME
        The basename of the class that is implementing this. Could have been determined by reflection.
        See Also:
        Constant Field Values
      • SHORT_NAME

        public static final java.lang.String SHORT_NAME
        The SHORTNAME for this implementation.
        See Also:
        Constant Field Values
      • PEGASUS_LITE_COMMON_FILE_BASENAME

        public static final java.lang.String PEGASUS_LITE_COMMON_FILE_BASENAME
        The basename of the pegasus lite common shell functions file.
        See Also:
        Constant Field Values
      • XBIT_TRANSFORMATION

        public static final java.lang.String XBIT_TRANSFORMATION
        The logical name of the transformation that creates directories on the remote execution pools.
        See Also:
        Constant Field Values
      • XBIT_EXECUTABLE_BASENAME

        public static final java.lang.String XBIT_EXECUTABLE_BASENAME
        The basename of the pegasus dirmanager executable.
        See Also:
        Constant Field Values
      • XBIT_TRANSFORMATION_NS

        public static final java.lang.String XBIT_TRANSFORMATION_NS
        The transformation namespace for the setXBit jobs.
        See Also:
        Constant Field Values
      • XBIT_TRANSFORMATION_VERSION

        public static final java.lang.String XBIT_TRANSFORMATION_VERSION
        The version number for the derivations for setXBit jobs.
      • XBIT_DERIVATION_NS

        public static final java.lang.String XBIT_DERIVATION_NS
        The derivation namespace for the setXBit jobs.
        See Also:
        Constant Field Values
      • XBIT_DERIVATION_VERSION

        public static final java.lang.String XBIT_DERIVATION_VERSION
        The version number for the derivations for setXBit jobs.
      • PEGASUS_LITE_EXITCODE_SUCCESS_MESSAGE

        public static final java.lang.String PEGASUS_LITE_EXITCODE_SUCCESS_MESSAGE
        The pegasus lite exitcode success message.
        See Also:
        Constant Field Values
      • mMajorVersionLevel

        private java.lang.String mMajorVersionLevel
        Stores the major version of the planner.
      • mMinorVersionLevel

        private java.lang.String mMinorVersionLevel
        Stores the minor version of the planner.
      • mPatchVersionLevel

        private java.lang.String mPatchVersionLevel
        Stores the patch version of the planner.
      • mLogger

        protected LogManager mLogger
        The LogManager object which is used to log all the messages.
      • mProps

        protected PegasusProperties mProps
        The object holding all the properties pertaining to Pegasus.
      • mSubmitDir

        protected java.lang.String mSubmitDir
        The submit directory where the submit files are being generated for the workflow.
      • mGenerateLOF

        protected boolean mGenerateLOF
        A boolean indicating whether to generate lof files or not.
      • mWorkerNodeExecution

        protected boolean mWorkerNodeExecution
        A boolean indicating whether to have worker node execution or not.
      • mSLS

        protected SLS mSLS
        The handle to the SLS implementor
      • mPOptions

        protected PlannerOptions mPOptions
        The options passed to the planner.
      • mSiteStore

        protected SiteStore mSiteStore
        Handle to the site catalog store.
      • mEnablingPartOfAggregatedJob

        protected boolean mEnablingPartOfAggregatedJob
        An instance variable to track if enabling is happening as part of a clustered job. See Bug 21 comments on Pegasus Bugzilla
      • mKickstartGridStartImpl

        private Kickstart mKickstartGridStartImpl
        Handle to kickstart GridStart implementation.
      • mStageSLSFile

        protected boolean mStageSLSFile
        Boolean to track whether to stage sls file or not
      • mLocalPathToPegasusLiteCommon

        protected java.lang.String mLocalPathToPegasusLiteCommon
        The local path on the submit host to pegasus-lite-common.sh
      • mTransferWorkerPackage

        protected boolean mTransferWorkerPackage
        Boolean indicating whether worker package transfer is enabled or not
      • mWorkerPackageMap

        java.util.Map<java.lang.String,​java.lang.String> mWorkerPackageMap
        A map indexed by execution site and the corresponding worker package location in the submit directory
      • mChmodOnExecutionSiteMap

        private java.util.Map<java.lang.String,​java.lang.String> mChmodOnExecutionSiteMap
        A map indexed by the execution site and value is the path to chmod on that site.
    • Constructor Detail

      • PegasusLite

        public PegasusLite()
    • Method Detail

      • initialize

        public void initialize​(PegasusBag bag,
                               ADag dag)
        Initializes the GridStart implementation.
        Specified by:
        initialize in interface GridStart
        Parameters:
        bag - the bag of objects that is used for initialization.
        dag - the concrete dag so far.
      • enable

        public boolean enable​(AggregatedJob job,
                              boolean isGlobusJob)
        Enables a job to run on the grid. This also determines how the stdin, stderr and stdout of the job are to be propagated. To grid-enable a job, the job may need to be wrapped in another job that actually launches it. This usually results in the job description that was passed in being modified.
        Specified by:
        enable in interface GridStart
        Parameters:
        job - the Job object containing the job description of the job that has to be enabled on the grid.
        isGlobusJob - is true, if the job generated a line universe = globus, and thus runs remotely. Set to false, if the job runs on the submit host in any way.
        Returns:
        boolean true if enabling was successful, else false.
      • enable

        public boolean enable​(Job job,
                              boolean isGlobusJob)
        Enables a job to run on the grid by launching it directly. It ends up running the executable directly without going through any intermediate launcher executable. It connects stdio and stderr to the underlying condor mechanisms so that they are transported back to the submit host.
        Specified by:
        enable in interface GridStart
        Parameters:
        job - the Job object containing the job description of the job that has to be enabled on the grid.
        isGlobusJob - is true, if the job generated a line universe = globus, and thus runs remotely. Set to false, if the job runs on the submit host in any way.
        Returns:
        boolean true if enabling was successful, else false when the path to kickstart could not be determined on the site where the job is scheduled.
      • enableForWorkerNodeExecution

        private void enableForWorkerNodeExecution​(Job job,
                                                  boolean isGlobusJob)
        Enables jobs for worker node execution.
        Parameters:
        job - the job to be enabled.
        isGlobusJob - is true, if the job generated a line universe = globus, and thus runs remotely. Set to false, if the job runs on the submit host in any way.
      • canSetXBit

        public boolean canSetXBit()
        Indicates whether the enabling mechanism can set the X bit on the executable on the remote grid site, in addition to launching it on the remote grid site.
        Specified by:
        canSetXBit in interface GridStart
        Returns:
        false, as no wrapper executable is being used.
      • getVDSKeyValue

        public java.lang.String getVDSKeyValue()
        Returns the value of the vds profile with key as Pegasus.GRIDSTART_KEY, that would result in the loading of this particular implementation. It is usually the name of the implementing class without the package name.
        Specified by:
        getVDSKeyValue in interface GridStart
        Returns:
        the value of the profile key.
        See Also:
        org.griphyn.cPlanner.namespace.Pegasus#GRIDSTART_KEY
      • shortDescribe

        public java.lang.String shortDescribe()
        Returns a short textual description in the form of the name of the class.
        Specified by:
        shortDescribe in interface GridStart
        Returns:
        short textual description.
      • defaultPOSTScript

        public java.lang.String defaultPOSTScript()
        Returns the SHORT_NAME for the POSTScript implementation that is used as the default with this GridStart implementation.
        Specified by:
        defaultPOSTScript in interface GridStart
        Returns:
        the identifier for the default POSTScript implementation for kickstart gridstart module.
        See Also:
        Kickstart.defaultPOSTScript()
      • getDirectoryKey

        private java.lang.String getDirectoryKey​(Job job)
        Returns the directory that is associated with the job to specify the directory in which the job needs to run
        Parameters:
        job - the job
        Returns:
        the condor key, which can be initialdir or remote_initialdir.
      • removeDirectoryKey

        private boolean removeDirectoryKey​(Job job)
        Returns a boolean indicating whether to remove remote directory information or not from the job. This is determined on the basis of the style key that is associated with the job.
        Parameters:
        job - the job in question.
        Returns:
        boolean
      • construct

        private void construct​(Job job,
                               java.lang.String key,
                               java.lang.String value)
        Constructs a condor variable in the condor profile namespace associated with the job. Overrides any preexisting key values.
        Parameters:
        job - contains the job description.
        key - the key of the profile.
        value - the associated value.
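
        The override behavior described above matches plain map semantics, where a later put replaces an earlier value for the same key. The sketch below illustrates only that semantics: CondorProfileSketch and its backing map are hypothetical stand-ins for the condor profile namespace of a job, not the real Pegasus classes. The keys initialdir and remote_initialdir are taken from the getDirectoryKey documentation; the directory values are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a job's condor profile namespace. */
class CondorProfileSketch {

    // Key/value pairs that would end up in the job's condor profile.
    private final Map<String, String> condor = new LinkedHashMap<>();

    /** Sets a condor key, overriding any preexisting value for that key. */
    void construct(String key, String value) {
        condor.put(key, value);
    }

    /** Returns the current value for a key, or null if it is not set. */
    String get(String key) {
        return condor.get(key);
    }

    public static void main(String[] args) {
        CondorProfileSketch profile = new CondorProfileSketch();
        profile.construct("initialdir", "/first/dir");   // hypothetical value
        profile.construct("initialdir", "/second/dir");  // replaces the first value
        System.out.println(profile.get("initialdir"));
    }
}
```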
      • generateListofFilenamesFile

        public java.lang.String generateListofFilenamesFile​(java.util.Set files,
                                                            java.lang.String basename)
        Writes out the list of filenames file for the job.
        Parameters:
        files - the set of PegasusFile objects for the files whose stat information is required.
        basename - the basename of the file that is to be created
        Returns:
        the full path to lof file created, else null if no file is written out.
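
        A minimal sketch of the contract above, under stated assumptions: the file holds one name per line, plain strings stand in for PegasusFile objects, an explicit directory argument stands in for the submit directory, and null is returned when the set is empty or the file cannot be written. The class ListOfFilenamesSketch and the method writeListOfFilenames are hypothetical names, not the Pegasus implementation.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Set;

/** Hypothetical illustration; not part of the Pegasus API. */
class ListOfFilenamesSketch {

    /**
     * Writes one filename per line to dir/basename.
     * Returns the full path to the file, or null if nothing was written.
     */
    static String writeListOfFilenames(Set<String> filenames, File dir, String basename) {
        if (filenames == null || filenames.isEmpty()) {
            return null; // assumption: an empty set means no file is written out
        }
        File lof = new File(dir, basename);
        try (PrintWriter writer = new PrintWriter(lof, StandardCharsets.UTF_8.name())) {
            for (String name : filenames) {
                writer.println(name);
            }
        } catch (IOException e) {
            return null; // assumption: a failure to open the file is reported as null
        }
        return lof.getAbsolutePath();
    }
}
```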
      • getWorkerNodeDirectory

        public java.lang.String getWorkerNodeDirectory​(Job job)
        Returns the directory in which the job executes on the worker node.
        Specified by:
        getWorkerNodeDirectory in interface GridStart
        Parameters:
        job -
        Returns:
        the full path to the directory where the job executes
      • wrapJobWithPegasusLite

        protected java.io.File wrapJobWithPegasusLite​(Job job,
                                                      boolean isGlobusJob)
        Generates a seqexec input file for the job. The function first enables the job via kickstart module for worker node execution and then retrieves the commands to put in the input file from the environment variables specified for kickstart. It creates a single input file for the seqexec invocation. The input file contains commands to
         1) create directory on worker node
         2) fetch input data files
         3) execute the job
         4) transfer the output data files
         5) cleanup the directory
         
        Parameters:
        job - the job to be enabled.
        isGlobusJob - is true, if the job generated a line universe = globus, and thus runs remotely. Set to false, if the job runs on the submit host in any way.
        Returns:
        the file handle to the seqexec input file
      • convertToTransferInputFormat

        protected java.lang.StringBuffer convertToTransferInputFormat​(java.util.Collection<FileTransfer> files)
        Converts the collection of files into an input format suitable for the transfer executable.
        Parameters:
        files - Collection of FileTransfer objects.
        Returns:
        the blurb containing the files in the input format for the transfer executable
      • slurpInFile

        protected java.lang.StringBuffer slurpInFile​(java.lang.String directory,
                                                     java.lang.String file)
                                              throws java.io.IOException
        Convenience method to slurp in contents of a file into memory.
        Parameters:
        directory - the directory where the file resides
        file - the file to be slurped in.
        Returns:
        StringBuffer containing the contents
        Throws:
        java.io.IOException
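
        One way such a convenience method could look, as a sketch with the same parameter and return types as the signature above. The class name SlurpSketch is hypothetical, and reading line by line while normalizing line endings to a single newline is an assumption, not confirmed behavior of the Pegasus method.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

/** Hypothetical illustration; not the Pegasus implementation. */
class SlurpSketch {

    /** Reads directory/file into memory, appending a newline after every line. */
    static StringBuffer slurpInFile(String directory, String file) throws IOException {
        StringBuffer result = new StringBuffer();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader(new File(directory, file)))) {
            String line;
            while ((line = in.readLine()) != null) {
                result.append(line).append('\n');
            }
        }
        return result;
    }
}
```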
      • getPathToChmodExecutable

        protected java.lang.String getPathToChmodExecutable​(java.lang.String site)
        Returns the path to the chmod executable for a particular execution site by looking up the transformation executable.
        Parameters:
        site - the execution site.
        Returns:
        the path to chmod executable
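
        The field mChmodOnExecutionSiteMap suggests that the chmod path is remembered per execution site. The sketch below shows one possible shape of such a cached lookup. ChmodPathCacheSketch and the catalogLookup function are hypothetical; the function merely stands in for the transformation lookup, whose real API is not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of a per-site cache; not the Pegasus implementation. */
class ChmodPathCacheSketch {

    // Execution site -> path to chmod on that site.
    private final Map<String, String> chmodOnSite = new HashMap<>();

    // Stand-in for the transformation lookup (an assumption of this sketch).
    private final Function<String, String> catalogLookup;

    ChmodPathCacheSketch(Function<String, String> catalogLookup) {
        this.catalogLookup = catalogLookup;
    }

    /** Returns the chmod path for a site, consulting the lookup only on the first request. */
    String getPathToChmodExecutable(String site) {
        return chmodOnSite.computeIfAbsent(site, catalogLookup);
    }
}
```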
      • setXBitOnFile

        protected boolean setXBitOnFile​(java.lang.String file)
        Sets the xbit on the file.
        Parameters:
        file - the file for which the xbit is to be set
        Returns:
        boolean indicating whether xbit was set or not.
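
        For a file on the local filesystem, the standard library offers java.io.File.setExecutable for this purpose. The following sketch assumes that the file is local and that a missing file counts as failure; XBitSketch is a hypothetical class, and whether the Pegasus method works this way is not stated on this page.

```java
import java.io.File;

/** Hypothetical illustration; not the Pegasus implementation. */
class XBitSketch {

    /** Sets the executable bit on a local file and reports whether that succeeded. */
    static boolean setXBitOnFile(String file) {
        File f = new File(file);
        try {
            // setExecutable(true) returns false if the permission could not be changed
            return f.exists() && f.setExecutable(true);
        } catch (SecurityException e) {
            return false; // a security manager denied access
        }
    }
}
```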
      • getSubmitHostPathToPegasusLiteCommon

        protected java.lang.String getSubmitHostPathToPegasusLiteCommon()
        Determines the path to common shell functions file that Pegasus Lite wrapped jobs use.
        Returns:
        the path on the submit host.
      • useFullPathToGridStarts

        public void useFullPathToGridStarts​(boolean fullPath)
        Description copied from interface: GridStart
        Setter method to control whether a full path to Gridstart should be returned while wrapping a job or not.
        Specified by:
        useFullPathToGridStarts in interface GridStart
        Parameters:
        fullPath - if set to true, indicates that full path would be used.
      • associateCredentials

        private void associateCredentials​(Job job,
                                          java.util.Collection<FileTransfer> files)
        Associates credentials with the job corresponding to the files that are being transferred.
        Parameters:
        job - the job for which credentials need to be added.
        files - the files that are being transferred.
      • retrieveLocationForWorkerPackageFromTC

        protected java.lang.String retrieveLocationForWorkerPackageFromTC​(java.lang.String site)
        Retrieves the location for the pegasus worker package from the TC for a site
        Returns:
        the path to worker package tar file on the site, else null if unable to determine
      • complainForHeadNodeFileServer

        private void complainForHeadNodeFileServer​(java.lang.String jobname,
                                                   java.lang.String site)
        Complains about a missing head node file server on a site for a job.
        Parameters:
        jobname - the name of the job
        site - the site