Fixed Bugs for IBM Platform LSF Version Fix Pack 3


The following bugs have been fixed in LSF Version Fix Pack 3 between 30th May 2014 and 8th June 2015:

P Date
When executing a job containing multiple tasks, the task RES calculates an incorrect XDR size and causes an XDR encoding error.
Component: res
Impact: Jobs that are launched by blaunch fail to execute.

P Date
In MultiCluster lease mode, a buffer overflow occurs when the lsb.lease.state file is too large, causing an mbatchd core dump.
Impact: mbatchd core dumps and LSF is unrecoverable.

P Date
In MultiCluster forward mode, a parallel job submitted with the same section as RES_REQ is blocked.
Component: mbschd schmod_mc.so
Impact: Jobs remain pending indefinitely on the submission cluster.

P Date
When the lsb_submit() API is used to submit jobs for both a parent and a child process, jobs submitted for the child process do not process LSB_SUB_MODIFY_FILE and LSB_SUB_MODIFY_ENVFILE properly.
Component: liblsf.a liblsb.so libbat.a libbat.so lsbatch.h lsf.h
Impact: esub does not work with the lsb_submit API.

P Date
When checkpoint jobs are submitted by script, brestart -W does not work and restarted jobs cannot be terminated by RUNLIMIT.
Component: sbatchd erestart
Impact: After running brestart, the job cannot exit and keeps showing "Checkpoint initiated" and "Checkpoint succeeded" in sequence.

P Date
When mbschd encounters an error on a job, other jobs do not get scheduled and remain pending.
Component: mbschd
Impact: Jobs remain pending until an administrator runs badmin reconfig.

P Date
There are problems with guarantee and preemption: consumer jobs that are high priority and guaranteed remain pending even when resource requirements are met.
Component: mbschd schmod_default.so
Impact: Some pending jobs cannot run even when resource requirements are met.

P Date
When using bsub to submit a job, a core dump occurs if the specified command and its arguments contain multiple quotations.
Component: bsub
Impact: Cannot submit jobs if the command and its arguments contain multiple quotations.

P Date
If there are many bhosts requests and an affinity host in the cluster (or if affinity is enabled in the cluster), after enabling LSB_QUERY_ENH in lsf.conf, the query child mbatchd core dumps repeatedly. The core dump is caused by a thread-unsafe function.
Impact: Child mbatchd core dumps, which causes b* query commands to fail.

P Date
When a license error occurs ("Unable to contact LIM"), mbatchd exits.
Impact: The error message is confusing.

P Date
When preemption and guaranteed SLA are both enabled, mbschd takes a long time to finish one scheduling cycle, especially when there are tens of thousands of pending jobs.
Component: mbschd
Impact: An mbschd performance issue causes low job throughput in the LSF cluster.

P Date
When LSF_TMPDIR is set to a shared file system, esub sometimes does not work because the temp file for one job is overwritten by another job.
Component: bsub
Impact: Wrong or missing job submission options are set in esub.

P Date
When shared resources are configured for the cluster, vemkd reports a warning message: "lsfinit: resource <resource_name> is being used by multiple hosts. It cannot be used in a resource requirement expression."
Component: vemkd
Impact: Several error messages are logged in the vemkd log file even with a correct configuration.

P Date
If sbatchd is not responding or is unavailable when mbatchd attempts to send modification information to it, sbatchd never receives the modification information.
Component: sbatchd
Impact: Using bmod to change a running job's run time limit does not take effect when sbatchd is unavailable.

P Date
In a mixed cluster environment, using a bpeek command on a job that is running on another host occasionally fails.
Component: bpeek
Impact: bpeek occasionally does not work.

P Date
When using the brestart command with GOLD integration jobs, the command fails due to missing job information such as a project name or job ID.
Component: brestart
Impact: Jobs restarted with brestart do not work well with GOLD integration.

P Date
When running a long pre-execution job, if the job is killed before the pre-execution script finishes, eexec cannot get the environment variable LSF_JOB_EXECUSER.
Component: sbatchd
Impact: Several GOLD reservations are not released if gcharge fails and the job is killed.

P Date
Cannot backfill the reserved job when the job is an exclusive job.
Component: mbschd schmod_reserve.so
Impact: Short jobs cannot use backfill slots.

Date
1. When advance reservation files exist (lsb.rsv.id, lsb.rsv.stat), LSF should log messages at the ERROR level if mbatchd fails to open the files for reading.
2. When advance reservation files do not exist, LSF should log messages at the LOG_INFO level.
3. mbatchd should always use the LSF primary administrator account to access these files under LSB_SHAREDIR. Currently, when mbatchd starts up, mbatchd uses root to read advance reservation files.
Impact: Users do not know when the advance reservation file is not accessible.

P Date
Performance enhancements for the LSF scheduler. Each is enabled with a parameter in lsf.conf:

1. LSB_SHARED_RSRC_ENH=Y
LSF allows you to configure multiple instances of a (site-defined) shared resource. For example, for shared resource "R", there may be one instance consisting of 10 units of R that is available on hosts 1 and 2, and a second instance consisting of 10 units of R that is available on hosts 3 and 4. Each host can be associated with at most one instance. If a job specifies a shared resource in its rusage string and LSF discovers that the job cannot use one host because of a lack of the resource, other hosts are also checked, since there may be multiple instances of the resource. In the special case of a single resource instance for the cluster (for example, representing a floating software license), LSF would ideally not consider any other hosts for the job. When you set LSB_SHARED_RSRC_ENH=Y, after LSF finds that an insufficient amount of a single-instance shared resource is available on one host, LSF will not consider other hosts for the job.

2. LSB_SKIP_FULL_HOSTS=Y
LSF removes unusable hosts from consideration at the beginning of each scheduling session. For example, hosts that are down (unavail or unreach), closed by the administrator (closed_adm), or closed due to a load threshold (closed_busy) are unusable by any job and can be removed from consideration. Removing these hosts early in the scheduling session improves performance. Hosts with all slots occupied (closed_full) are not removed, since these hosts can still be used by jobs in preemptive queues, if queue-based preemption is enabled. For sites without preemption configured, it is not necessary for LSF to consider full hosts. When you set LSB_SKIP_FULL_HOSTS=Y, LSF removes full hosts from consideration at the beginning of each scheduling session, as long as either the preemption plug-in is not loaded or there is no preemption relationship between queues. For more details, see the PREEMPTION parameter in lsb.queues.

3. LSB_DISABLE_PROJECT_LIMITS=Y
Internally, LSF puts jobs with like attributes into "buckets" for scheduling efficiency. The idea is that if LSF cannot dispatch one job in a job bucket during a scheduling session, LSF can generally assume that the rest of the jobs in the same bucket cannot be dispatched either, and therefore do not need to be considered. In general, fewer job buckets leads to better scheduling performance. Use "badmin perfmon view" to see the current number of job buckets in your cluster. By default, LSF separates jobs with different projects (given by the bsub -P option) into different buckets. The reason for this is to handle project-based limits, for example, a limit on project P1 of 25 slots. If you do not configure project limits, you can set LSB_DISABLE_PROJECT_LIMITS=Y to prevent LSF from separating jobs into buckets based on project name. When this is enabled, LSF will ignore any configured project limits.

4. LSB_FAST_REQUEST_NEW_JOBS=Y
This parameter reduces the time taken in communicating newly submitted jobs from mbatchd to mbschd.

5. LSB_SHARE_LOCATION_ENH=Y
This parameter improves LSF performance by reducing the sizes of messages passed between mbatchd and mbschd. By default, messages between these daemons identify each instance of a shared resource by the list of hosts corresponding to that instance. After you set LSB_SHARE_LOCATION_ENH=Y, each instance is assigned an integer ID that is used for communication. This enhancement is especially useful for sites with several configured shared resources.

Component: mbschd
Impact: Lower mbschd performance has a large impact on the job dispatching rate in a cluster.

P Date
This fix introduces five performance enhancements for the LSF scheduler. You must individually enable each of the following enhancements with a parameter in lsf.conf:

1. LSB_SHARED_RSRC_ENH=Y
LSF allows you to configure multiple instances of a (site-defined) shared resource. For example, for a shared resource "R", there may be one instance consisting of 10 units of R that is available on hosts 1 and 2, and a second instance consisting of 10 units of R that is available on hosts 3 and 4. Each host can be associated with at most one instance. If a job specifies a shared resource in its rusage string and LSF discovers that the job cannot use one host because of a lack of the resource, other hosts are also checked, since there may be multiple instances of the resource. In the special case of a single resource instance for the cluster (for example, representing a floating software license), LSF would ideally not consider any other hosts for the job. When you set LSB_SHARED_RSRC_ENH=Y, after LSF finds that an insufficient amount of a single-instance shared resource is available on one host, LSF will not consider other hosts for the job.

2. LSB_SKIP_FULL_HOSTS=Y
LSF removes unusable hosts from consideration at the beginning of each scheduling session. For example, hosts that are down (unavail or unreach), closed by the administrator (closed_adm), or closed due to a load threshold (closed_busy) are unusable by any job and can be removed from consideration. Removing these hosts early in the scheduling session helps with performance. Hosts with all slots occupied (closed_full) are not removed, since these hosts can still be used by jobs in preemptive queues, if queue-based preemption is enabled. For sites without preemption configured, it is not necessary for LSF to consider full hosts. When you set LSB_SKIP_FULL_HOSTS=Y, LSF removes full hosts from consideration at the beginning of each scheduling session, as long as either the preemption plug-in is not loaded or there is no preemption relationship between queues (for more details, see the PREEMPTION parameter in lsb.queues).

Component: mbatchd mbschd schmod_default.so schmod_reserve.so schmod_preemption.so schmod_affinity.so schmod_parallel.so schmod_advrsv.so schmod_aps.so schmod_bluegene.so schmod_cpuset.so schmod_craylinux.so schmod_crayx1.so schmod_dc.so schmod_dist.so schmod_fairshare.so schmod_fcfs.so schmod_jobweight.so schmod_limit.so schmod_mc.so schmod_ps.so schmod_pset.so schmod_rms.so schmod_xl.so
Impact: Lower mbschd performance has a large impact on the job dispatching rate in a cluster.

P Date
An issue with the LSF library can cause the NIOS process to enter a busy loop, resulting in issues with MPI job performance. This issue is triggered when the stdin of blaunch is neither /dev/null nor a FIFO. When running fluent under Platform MPI, the stdin of blaunch is redirected to a socket, which triggers this issue.
Component: nios
Impact: MPI job performance is affected.

P Date
Incorrect behavior when using bsub -I to submit a job containing "&&" in the command line (for example, bsub -I echo "test&&mg").
Component: bsub
Impact: Job command is changed by LSF if && is used in the command line.
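The five scheduler performance parameters described in the entries above are all set in lsf.conf. A minimal sketch of enabling them together (whether to combine them is a site decision; the parameter names and values are the documented ones):

```
# lsf.conf - scheduler performance enhancements from this fix pack
LSB_SHARED_RSRC_ENH=Y          # stop probing other hosts for single-instance shared resources
LSB_SKIP_FULL_HOSTS=Y          # drop closed_full hosts early (no queue preemption configured)
LSB_DISABLE_PROJECT_LIMITS=Y   # do not split job buckets by project; ignores project limits
LSB_FAST_REQUEST_NEW_JOBS=Y    # faster hand-off of new jobs from mbatchd to mbschd
LSB_SHARE_LOCATION_ENH=Y       # integer IDs instead of host lists in daemon messages
```

Note that these parameters generally require restarting the batch daemons (for example, badmin mbdrestart) to take effect, and that LSB_DISABLE_PROJECT_LIMITS=Y causes LSF to ignore any configured project limits.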

P Date
If more than 1000 resources are defined in lsf.shared, then lsadmin and badmin core dump.
Component: lsadmin badmin
Impact: Cannot use lsadmin and badmin to start up LSF daemons when more than 1000 resources are defined in lsf.shared.

P Date
A job with span[ptile=x] may cause an mbschd core dump.
Component: mbschd
Impact: mbschd core dumps, causing a low job dispatch rate.

P Date
When a parallel job finishes, LSF reports an incorrect MAXMEM value.
Component: res
Impact: LSF reports a larger than actual memory usage, impacting the analysis of parallel jobs.

P Date
When brequeue is used to requeue a job to pending status, the run time value in lsb.stream is incorrect. Therefore, the job is killed before resuming.
Impact: Requeued jobs that are killed while pending are logged in the Platform Analytics database with a long run time.

P Date
bjobs -l does not show the same effective resource requirement string if the originally-specified resource requirement string is longer than 512 bytes.
Component: mbschd
Impact: If a different string is shown for the effective resource requirement, a user may think the job is dispatched incorrectly.

P Date
Guaranteed resources are not held when there is a host in the closed_busy state.
Component: mbschd schmod_default.so
Impact: Configured guarantees cannot be held.

P Date
After running lsrun on some hosts, the lsload and lsload -E commands display incorrect r15s and r1m values for those hosts.
Component: lim
Impact: The lsload, lsload -E, and lsload -N commands display incorrect results on some hosts after running lsrun.

P Date
Compute unit resource requirements do not work with leased-in hosts.
1. This fix allows for the specification of leased-in hosts in the definition of a compute unit. For example:

Begin ComputeUnit
NAME    MEMBER        TYPE
en1     (host1@mc1)   (enclosure)
en2     (ho*@mc1)     (enclosure)
End ComputeUnit

Note: A valid name of a leased-in host must be defined for the MEMBER column. The badmin reconfig command does not log an error or warning message if you specify an invalid host name. mbatchd only logs the error or warning message in the mbatchd log file after mbatchd gets the leased-in host information from the remote cluster.
2. This fix allows dynamic hosts and leased-in hosts to join a compute unit (after running badmin reconfig to apply the changes). This allows jobs with a compute unit resource requirement to be dispatched to the new dynamic hosts and leased-in hosts.
Component: mbschd
Impact: Users cannot specify compute unit resource requirements for leased-in hosts.

P Date
When there are no more records, lsb_readjobinfo() returns error code 53 instead of 47.
Component: liblsf.a liblsf.so libbat.a libbat.so lsf.h lsbatch.h
Impact: The LSF API client code fails to detect the case when there are no more job records in mbatchd.

P Date
If using both compute units and affinity, some jobs may cause an mbschd core dump.
Component: mbschd schmod_parallel.so schmod_reserve.so
Impact: mbschd core dumps, causing a low job dispatch rate.

P Date
CPU binding does not work with LSF_BIND_JOB after LSF is started using the 'lsf_daemons start' command.
Component: sbatchd
Impact: Jobs cannot be bound after running 'lsf_daemons start'.

P Date
When running badmin reconfig, mbatchd fails to receive the system user name or system group name. Therefore, after badmin reconfig, jobs that specify a group name using -G cannot be submitted, due to an "Unknown user or user group" error.
Component: sbatchd
Impact: New jobs specifying a group name cannot be submitted after running badmin reconfig.

P Date
A dynamic host changes to "closed_inactive" status if it is not a Platform MultiCluster host.
Impact: A dynamic host cannot be used after it is changed to "closed_inactive".

P Date
The Performance Application Programming Interface (PAPI) conflicts with hardware counter data collection in LSF. When LSF_COLLECT_ENERGY_USAGE in lsf.conf is set to "Y", jobs submitted with energy aware scheduling options and using PAPI do not trigger hardware counters correctly. This fix introduces the esub environment variable LSB_SUB4_COLLECT_ENERGY_USAGE, which allows LSF to collect energy-related usage data at the job level, narrowing down the cluster-level energy usage data collection.
Impact: Job does not trigger energy usage data collection.
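Based on the entry above, an esub could request energy usage collection per job. A hedged sketch of such an esub script: only the LSB_SUB4_COLLECT_ENERGY_USAGE variable name comes from the fix; the use of LSB_SUB_MODIFY_FILE and the exact value syntax are assumptions based on the general esub mechanism:

```
#!/bin/sh
# Hypothetical esub sketch: request energy usage collection for this job only.
# LSB_SUB_MODIFY_FILE is the file an esub writes submission option
# overrides to; the value syntax below is an assumption.
echo 'LSB_SUB4_COLLECT_ENERGY_USAGE = "Y"' >> "$LSB_SUB_MODIFY_FILE"
```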

P Date
bjobs and bhist sometimes show the wrong signal number when the job reaches the run limit.
Component: bjobs bhist
Impact: End users do not know the actual job exit reason.

P Date
When LSF kills a job that is part of a job dependency condition (that is, when LSF kills a job that other jobs depend on), mbatchd takes a long time to restart.
Impact: mbatchd is busy evaluating job dependencies, causing LSF to stop working.

P Date
When a dynamic host with exclusive resources joins the cluster, the exclusive resources disappear from the dynamic host after re-configuring the LIM.
Component: lim
Impact: After reconfiguring the LIM, exclusive resources are lost from the dynamic host.

P Date
Job finish time includes the post-execution processing time, which impacts Platform RTM statistics.
Impact: RTM does not report the correct job finish time.

P Date
When $LSF_ENVDIR is not set, elim.hpc does not check if /etc/lsf.conf exists. When $LSF_BINDIR is not set, elim.hpc has a security hole that can cause normal users to gain root permissions.
Component: elim.hpc
Impact: elim.hpc does not work properly if $LSF_ENVDIR or $LSF_BINDIR are not set.

P Date
mbatchd may fail when any line of lsb.users is longer than 4352 characters.
Impact: mbatchd core dumps and LSF no longer works.

P Date
When submitting jobs with memory requirements and specifying "span[hosts=1]", if there are no hosts in the cluster that can meet the memory requirement, LSF still makes a slot reservation for the job.
Component: mbschd schmod_default.so
Impact: A high priority job reserves resources but cannot run on the host, causing a waste of resources.

P Date
The unit for the mem, swp, and tmp thresholds that lshosts displays is not changed after defining a different unit using LSF_UNIT_FOR_LIMITS in lsf.conf.
Component: lshosts
Impact: The lshosts output is incorrect.

P Date
When shared resources are configured only on dynamic hosts, the master LIM core dumps if all the dynamic hosts are removed from the cluster.
Component: lim
Impact: LIM core dumps and LSF no longer works.

P Date
Job execution initialization fails if the execution host cannot resolve the submission host. This fix introduces the following parameter:

LSB_DISABLE_SUB_HOST_LOOKUP
Configured in: lsf.conf
Syntax: LSB_DISABLE_SUB_HOST_LOOKUP=Y|N
Disables submission host name lookup when executing jobs. When this parameter is set, the job execution sbatchd does not look up the submission host name when executing or cleaning up the job. LSF will not be able to do any host-dependent automounting.
Default: N. LSF looks up the submission host name when executing jobs.

Component: sbatchd badmin
Impact: The job cannot run on some hosts.
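For clusters where execution hosts cannot reliably resolve submission host names, the parameter above is enabled in lsf.conf. A minimal sketch:

```
# lsf.conf
LSB_DISABLE_SUB_HOST_LOOKUP=Y   # sbatchd skips submission host name lookup;
                                # host-dependent automounting is disabled
```

Changes to this parameter take effect after restarting sbatchd on the execution hosts (for example, with badmin hrestart).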

P Date
Due to a transient name resolution failure causing mbatchd/sbatchd communication issues, finished jobs are reported as running and cannot be killed with bkill. When the compute host's host name is incorrect, the master host cannot receive the job status. Therefore, the system keeps the job in a run status and it cannot be killed. This fix assumes that the host names configured for the LSF cluster are the same as the official names configured in the DNS server or /etc/hosts. Both host names (LSF cluster and DNS server) may include the domain (or not), but they must match.
Impact: mbatchd keeps jobs in running status and end users cannot use bkill to kill their jobs.

P Date
If a blaunch job is submitted using 'bsub -i' with a large input file (larger than 8192 bytes), the job hangs.
Component: res
Impact: LSF jobs hang when a very large input file is specified.

P Date
Improvement for job scheduling performance when there are several single-host parallel job buckets.
Component: mbschd schmod_parallel.so
Impact: Lower mbschd performance has a large impact on the job dispatch rate in a cluster.

P Date
When the status of a host becomes UNAVAIL and there is an advance reservation defined on the host, warning messages do not clarify whether there are more slots reserved than are available on the host, or whether the problem is that the status became UNAVAIL.
Impact: The warning message is misleading and does not tell users how to avoid it.

P Date
When resizing the terminal window of an interactive blaunch job, the job exits with the SIGPROF signal.
Component: res
Impact: Parallel jobs are unexpectedly killed by LSF.

P Date
If a user group is updated with egroup using EGROUP_UPDATE_INTERVAL=1 as defined in lsb.params, and a new user is added to the user group, that user's MAX_JOBS value does not display the correct value.
Component: mbatchd
Impact: The wrong MAX_JOBS value applies to LSF users.

P Date
This fix adds a parameter to lsf.conf to control the memory usage report when using cgroup:

LSB_CGROUP_MEM_INCLUDE_CACHE
Syntax: LSB_CGROUP_MEM_INCLUDE_CACHE=Y|N|y|n
When set to "Y/y", LSF includes rss and cache in the memory usage report when cgroup is enabled. When set to "N/n", LSF only includes rss in the memory usage report when cgroup is enabled.
Default: Y

Component: sbatchd res
Impact: LSF reports that a job is using more memory than it actually uses, causing the job to be unexpectedly killed.

P Date
It takes more than 10 minutes to kill an array job with approximately 1000 elements.
Impact: Array jobs cannot be killed in a short time, causing slots to go unused.

P Date
bhosts and busers may report an incorrect reserved slots value when time-based slot reservation is enabled.
Component: mbschd schmod_reserve.so
Impact: Command output is wrong, which may confuse end users.
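To report only rss (excluding file cache) for cgroup-enabled jobs, the parameter described above can be set in lsf.conf. A minimal sketch:

```
# lsf.conf
LSB_CGROUP_MEM_INCLUDE_CACHE=N   # count only rss, not page cache, toward job memory
```

With the default of Y, file cache counts toward reported job memory, which is what can trigger the unexpected kills described in the entry above.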

P Date
A file descriptor limit that is set higher than a certain value is not respected by sbatchd and RES.
Component: sbatchd
Impact: Jobs depending on a large number of open files fail to run.

P Date
If a Platform MPI job is terminated because a task on the first node ran over the memory limit and was killed by cgroup memory fencing, the bjobs, bhist, and bacct commands cannot display the job exit reason and finished resource usage properly.
Component: sbatchd
Impact: Job accounting information is incorrect.

P Date
When submitting jobs with memory requirements ("rusage[mem=value]") and processor requirements ("span[ptile=value]"), there is an incorrect reservation of hosts that do not have enough memory.
Component: mbschd schmod_parallel.so schmod_reserve.so
Impact: A high priority job reserves resources but cannot run on the host, causing wasted resources.

P Date
When defining "LSB_QUERY_ENH=Y" in lsf.conf and performing several queries, the query child mbatchd might core dump.
Impact: Child mbatchd core dumps, causing b* query commands to fail.

P Date
A user job returns an inconsistent value because whether the job script process is killed by the SIGXFSZ signal depends on whether the job command redirects stdout. This fix introduces the following parameter:

LSB_JOB_SCRIPT_TRAP_SIGNALS
Configured in: lsf.conf
Syntax: LSB_JOB_SCRIPT_TRAP_SIGNALS=signal_name...
A list of the names of signals that are trapped by the job scripts. This parameter prevents the specified signals from killing the job script process. By default, the job scripts trap the SIGTERM, SIGUSR1, SIGUSR2, SIGINT, and SIGHUP signals, so you do not have to define these signals in this parameter. Because the job scripts cannot trap the SIGSTOP and SIGKILL signals, these values are not valid.
Valid values: A space-separated list of signal names. The first 31 signals are valid (from SIGHUP to SIGSYS), except for SIGSTOP and SIGKILL. This parameter is not supported on Windows platforms.
Default: Undefined. The job script does not trap any additional signals except SIGTERM, SIGUSR1, SIGUSR2, SIGINT, and SIGHUP.

Component: sbatchd
Impact: mbatchd keeps jobs in running status and end users cannot use bkill to kill their jobs.

P Date
Scheduling parallel jobs may cause slow mbschd performance.
Component: mbschd schmod_default.so schmod_parallel.so schmod_reserve.so
Impact: Lower mbschd performance has a large impact on the job dispatch rate in a cluster.
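For job scripts that should survive additional signals, the parameter described above takes a space-separated list of signal names in lsf.conf. A hedged sketch (the chosen signals are illustrative; SIGXFSZ is the signal named in the entry above):

```
# lsf.conf
# Also trap SIGXFSZ and SIGPIPE in job scripts, in addition to the
# default SIGTERM, SIGUSR1, SIGUSR2, SIGINT, and SIGHUP.
LSB_JOB_SCRIPT_TRAP_SIGNALS=SIGXFSZ SIGPIPE
```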

P Date
If a krb5 ticket renewal fails, the log messages are not sufficient to assist with troubleshooting.
Component: krbrenewd
Impact: Insufficient warning and error messages make it difficult to debug problems.

P Date
If a job dependency condition is "ended(jobid)", the dependency is broken when the parent job is requeued.
Impact: Job dependency is broken in some cases.

P Date
mbatchd does not accept user group names that end with a slash ("/").
Impact: LSF administrators cannot configure user group names ending with a slash ("/").

P Date
When running bpost on an execution cluster running a newer version of LSF with a submission cluster running LSF 7.0.6, the execution cluster mbatchd core dumps if the submission cluster mbatchd connection is lost and then reconnects.
Impact: mbatchd core dumps and LSF no longer works.

P Date
After upgrading the LSF cluster to version 9.1.3, the process tracking information for jobs that were still unfinished before upgrading is lost and cannot be recovered. This is because LSF changed the cgroup information file name format, so the old cgroup information files are no longer recognized by LSF.
Component: sbatchd
Impact: Cannot collect job run time usage information after upgrading to version 9.1.3.

P Date
When a newly-installed LSF cluster starts up, the master elim may report the following error message in the log file: "readloadupdatefromsubelim: Protocol error: loadcnt cannot be read from elim". This error message is a false alarm. The root cause is that some elims may start, but quickly exit with ELIM_ABORT_VALUE. A race condition might happen where the master elim reads the exited child elim process before receiving the SIGCHLD signal of the child, in which case the read fails and the master elim displays this error message.
Component: melim
Impact: The error message gives LSF administrators concerns about LSF product quality.

P Date
The job run time recorded in the lsb.acct file is incorrect when the job is UNKNOWN and mbatchd is restarted.
Impact: The incorrect job run time recorded in lsb.acct causes RTM to report incorrect job information.

P Date
When running SGI MPI jobs under pam, the CPU time report is incorrect.
Component: pam
Impact: End users do not receive the correct CPU time usage of their parallel jobs within pam.

P Date
bmgroup takes a long time to show new dynamic hosts, and it takes a long time (about 10 minutes) before the new dynamic hosts start accepting jobs.
Impact: It takes a long time for users to know that a dynamic host is ready to use.

P Date
If using a host partition configured with a host group, mbatchd might core dump.
Impact: mbatchd core dumps and LSF no longer works.

P Date
When the argument to blimits -u or -q is a substring of an actual user or queue name, the actual user or queue is still shown. This fix restricts the argument to exact matches and does not perform any expansion.
Component: blimits
Impact: blimits -u or -q shows some limits information that it should not show.

P Date
If a job has the span[ptile='!'] resource requirement, but the user who submitted the job did not define MXJ for any host type/model in lsb.hosts, and the user also did not specify a slot requirement for any host type/model in the span[] clause of the job's submission command:
- Earlier LSF versions ignore the span[ptile='!'] resource requirement and treat the job as an ordinary parallel job.
- This LSF version does not ignore the span[ptile='!'] requirement, but treats the clause as span[ptile=1].
This fix restores the previous LSF behavior for handling span[ptile='!'] resource requirements.
Impact: Some jobs are pending even though there are enough resources.

P Date
When mbatchd replays events and there are events that modify an entire job array to run in a large host group, mbatchd replays slowly and takes a long time to restart the cluster.
Impact: LSF mbatchd is very slow to start up.

P Date
Job execution fails under Ubuntu because /bin/sh is linked to /bin/dash in Ubuntu.
Component: sbatchd
Impact: Job fails to start.
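The span[ptile='!'] clause takes its per-host tile value from the MXJ definitions in lsb.hosts. An illustrative sketch (host names and values are hypothetical):

```
# lsb.hosts - per-host maximum job slots (MXJ)
Begin Host
HOST_NAME     MXJ
hostA         8
default       !
End Host
```

A job submitted with, for example, bsub -n 16 -R "span[ptile='!']" then uses each host's own MXJ as its per-host tile. Without any MXJ definition, the restored behavior treats the job as an ordinary parallel job rather than forcing span[ptile=1].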

28092 P Date
When killing a parallel job (submitted using blaunch), only the SIGKILL signal is received. Therefore it is hard for users to do job cleanup before the jobs are killed.
Component: sbatchd res blaunch
Impact: Users cannot do job cleanup before the job is killed.

P Date
When a line in lsb.users is longer than 4352 characters and you run badmin mbdrestart, running badmin ckconfig results in errors. However, the bconf command is successful.
Impact: The lsb.users file cannot contain too many users in one line.

P Date
If LSF_STRIP_DOMAIN is changed in lsf.conf, mbatchd -C may core dump.
Impact: badmin reconfig does not work.

P Date
bpeek and lsrun log unnecessary warning messages when LSB_KRB_TGT_FWD=Y is set in lsf.conf to control the Ticket Granting Ticket (TGT) forwarding feature, but no TGT is found.
Component: bpeek lsrun
Impact: Misleading error messages.

33435 P Date
mbatchd may core dump if mbatchd replays chunk job events.
Impact: Cluster is down.

P Date
When mbatchd fails to create the child query mbatchd due to a heavy network load, bjobs hangs indefinitely.
Impact: A workflow relying on a bjobs query stops working.

P Date
If the run time of a job is longer than approximately 166 days, then the RUN column width exceeds seven characters. bhist -w does not separate the RUN and USUSP columns in the output (compared to bhist -l output). This results in the two columns being combined together.
Component: bhist
Impact: Difficulty in seeing the RUN and USUSP output.

33980 P Date
When setting up utilization (ut) at the queue level to schedule enough CPU for jobs, LSF rounds the ut numbers, resulting in incorrect use of resources. For example, ut = 0.92 is 12 cores in a cluster and ut = 0.94 is 16 cores. When setting ut = 0.92 or 0.94 in lsb.queues, bqueues -l reports ut = 0.9. When setting ut = 0.96 in lsb.queues, bqueues -l reports ut = 1.
Component: bqueues
Impact: Incorrect use of resources.

P Date
When the system is too busy to release a port, the sbatchd restart fails because the socket failed to initialize.
Component: sbatchd
Impact: Compute nodes become unavailable, which reduces compute capacity.

P Date
Orphan jobs run when a dependent job gives an improper exit code. For example: Job3 depends on Job2 and Job2 depends on Job1. If Job1 dies unexpectedly, Job2 receives TERM_ORPHAN_SYSTEM correctly. However, it gives an exit code of zero, which causes Job3 to run anyway.
Impact: Dependent jobs are allowed to start when they should have been aborted.
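The orphan-job scenario above corresponds to a simple dependency chain. A hedged sketch (job names and scripts are hypothetical):

```
# Hypothetical dependency chain illustrating the orphan scenario:
bsub -J job1 ./step1.sh                      # parent job
bsub -J job2 -w "done(job1)" ./step2.sh      # runs only if job1 completes
bsub -J job3 -w "done(job2)" ./step3.sh      # should not run if job2 is
                                             # terminated as an orphan
```

With the fix, if job1 dies unexpectedly and job2 is terminated with TERM_ORPHAN_SYSTEM, job2 no longer exits with code zero, so the done(job2) condition is not satisfied and job3 does not start.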

33973 P Date
If mbatchd replays job switch events, mbatchd may core dump.
Component: bhist libbat.a libbat.so liblsbstream.so liblsf.a liblsf.so lsf.h lsbatch.h
Impact: The cluster does not work.

P Date
When using bsub -I < file, bsub uses tty as standard input, which is not the correct behavior.
Component: bsub
Impact: bsub -I < file does not work as expected.

P Date
CPU time is not correctly calculated by cgroup when blaunch is used to submit parallel jobs on the first host that runs the job RES.
Component: sbatchd res
Impact: The accuracy of the cputime accounting information is not reliable.

P Date
During periods of high query load (and when MAX_CONCURRENT_JOB_QUERY is set to attempt to improve it), bjobs attempts to query mbatchd every second. This results in poorer mbatchd performance.
Component: bjobs
Impact: Poor mbatchd performance.

34564 P Date
Queue-level pre-execution and queue-level host-based pre-execution scripts run with different user group memberships.
Component: sbatchd
Impact: Host-based pre-execution scripts fail.

P Date
When LSF uses epoll and License Scheduler receives a duplicate mbatchd registration (for example, after you run badmin ckconfig), the connection between bld and mbatchd is broken. Therefore, License Scheduler will not receive any job information.
Impact: No License Scheduler job-related information can be sent to bld, including running jobs and demands for tokens.

P Date
When a preempting job cannot be dispatched due to the guarantee SLA policy, its pending reason is set at the job level, which does not prevent similar jobs from being scheduled in the same session.
Component: mbschd schmod_default.so
Impact: Poor performance.

P Date
daemons.wrap logs unnecessary warning messages when restarting the parent sbatchd.
Component: sbatchd daemons.wrap
Impact: Misleading error messages.

36747 P Date
When running interactive jobs, NIOS may exit with exit code 255 even if the job completed successfully.
Component: res
Impact: The bsub exit code is given at incorrect times.

P Date
mbatchd sometimes dispatches jobs to unavailable (unavail) hosts after running badmin reconfig.
Impact: The wrongly dispatched jobs fail.

P Date
When the same job is requeued, then terminated, bacct counts this as two exited jobs for the "Total number of exited jobs" metric, even though the exit condition is for the same job. Therefore, bacct shows an incorrect number of exited jobs.
Component: bacct
Impact: Inconsistent data shown for bhist and bacct.

P Date
When running "blimits -w", the values of any limits based on EXTERNAL RESOURCES are truncated even though the command is run in wide mode.
Component: blimits
Impact: blimits output is truncated.

39162 P Date A dependency on a job name is rejected when the dependent job is attached to an empty SLA. Unexpected behavior; the root cause is difficult to debug because there is no indication of why the job submission is failing.

P Date When a job name includes special characters such as "%", running the "bjobs -o name" command may fail or display an incomplete job name. Component: bjobs. bjobs core dump.

P Date In some cases, LSF does not honor user-assigned priorities. Component: schmod_parallel.so schmod_reserve.so mbschd. Low-priority jobs are dispatched first and block high-priority jobs.

P Date bjobs shows an error message when the number of concurrent bjobs queries exceeds the value of the MAX_CONCURRENT_JOB_QUERY parameter specified in lsb.params. Component: bjobs. Some jobs may fail.

38996 P Date When sbatchd restarts, mbatchd sends sbatchd a package containing information on running jobs. The calculated size of the package is incorrect, which may lead to a "package full" error. Jobs cannot be scheduled.

P Date When restarting lim and mbatchd while submitting jobs from a float client at the same time, the mbatchd log shows continuous getcommittedruntime error messages. Continuous error messages fill the log.

P Date The job efficiency calculation (in Platform RTM) is incorrect when the job is automatically requeued due to its exit code. At the time of the automatic requeue, the job's runtime is reset to zero but the job's CPU time is not; the CPU time must also be reset to zero when the job is requeued. Job runtime and cputime are inconsistent for rerun jobs.
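The runtime/cputime inconsistency described in the last entry can be illustrated with a small sketch. RTM's actual formula is not reproduced here; the function below only assumes the common definition of efficiency as CPU time consumed per slot-second of wall time, which is enough to show why resetting runtime without resetting cputime inflates the result.

```python
def job_efficiency(cpu_time, run_time, slots=1):
    """Illustrative efficiency metric: CPU seconds per slot-second of wall time."""
    if run_time <= 0:
        return 0.0
    return cpu_time / (run_time * slots)

# Before the automatic requeue: 80 CPU-seconds over 100 wall seconds on 1 slot.
print(job_efficiency(80.0, 100.0))   # 0.8

# After the requeue, runtime restarts at 10 s but cputime was not reset (80 + 8):
print(job_efficiency(88.0, 10.0))    # 8.8 -- impossible (> 1), hence the fix
# With cputime also reset at requeue time, only the new interval is counted:
print(job_efficiency(8.0, 10.0))     # 0.8
```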

44462 P Date When LSF_TMPDIR is configured with a directory other than /tmp, LSB_CHECK_JOB_PID_REUSE_ON_REBOOT does not work. Component: sbatchd. LSB_CHECK_JOB_PID_REUSE_ON_REBOOT does not work and the job PID can be reused, causing LSF to think the job is still running.

P Date When the system is too busy to release a port, the lim/res restart fails because the socket fails to initialize. Component: lim res. Failure of the lim restart causes a failover; if a job is submitted during this event, the submission may fail.

P Date Setting the smoothing factor in the page rate report to a fixed value is inconvenient. This fix introduces a parameter to control the smoothing factor in the page rate report:

Syntax: EGO_LIM_PG_SMOOTH_FACTOR = smoothing_factor

Specifies the smoothing factor when lim reports the host page rate. The smoothing factor controls how fast reported values converge to the instantaneous value. The smoothing_factor value must be an integer between 0 and 10. If set to 0, no smoothing is applied and the reported value is equal to the instantaneous value. The larger the value, the more time LSF needs to react to a page rate change on the host. This parameter is only supported on Linux platforms.

Default: 4

Component: lim. When the index exceeds its threshold incorrectly, too much idle capacity per compute host is lost.

P Date When mbatchd dumps pending reasons, mbatchd incorrectly dumps "dumpcondensedpendingreasons" as well. dumpcondensedpendingreasons fails in the child mbatchd.

P Date Jobs defined with a memory-only guarantee may remain pending because host slots are used by higher-priority jobs. Therefore, memory-only guarantee package pools do not work without a host slot guarantee. Performance slowness and deadlock.

P Date When the pending job count exceeds 50,000, the scheduler experiences performance issues until the number of pending jobs decreases. Component: mbschd schmod_default.so schmod_limit.so schmod_preemption.so. Job scheduling is slow.
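The behavior of a page-rate smoothing factor such as the EGO_LIM_PG_SMOOTH_FACTOR parameter introduced above can be sketched with a simple exponential moving average. The mapping from the 0-10 factor to a smoothing weight below is an assumption chosen for illustration, not LSF's internal formula; it only demonstrates the documented endpoints (factor 0 reports the instantaneous value, larger factors react more slowly).

```python
def smooth(reported_prev, instantaneous, factor):
    """Illustrative exponential smoothing of one page-rate sample.

    factor=0 -> no smoothing (report the instantaneous value);
    larger factors -> the reported value converges more slowly.
    The weight mapping below is an assumed example, not LSF's formula.
    """
    if not 0 <= factor <= 10:
        raise ValueError("smoothing factor must be an integer between 0 and 10")
    alpha = 1.0 - factor / 10.0  # assumed mapping: factor 0 -> alpha 1, factor 10 -> alpha 0
    return alpha * instantaneous + (1.0 - alpha) * reported_prev

# A sudden page-rate spike from 0 to 100 pages/second:
print(smooth(0.0, 100.0, 0))   # factor 0 tracks the spike immediately
print(smooth(0.0, 100.0, 4))   # a default-like factor reacts more gradually
```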

The following solutions have been delivered in LSF Version Fix Pack 3 between 30th May 2014 and 8th June 2015:

Date Support for the -R option of the brestart command, to let end users change the resource requirements of a restarted job. The syntax of the -R option of brestart is the same as that of the -R option of the bsub and bmod commands. Component: brestart.

Date This solution dumps the contents of the job buckets to a file in order to address the following issues: a) A smaller number of job buckets in the system shortens the scheduling cycle. b) The total number of job buckets can be shown in the "badmin perfmon view" output, but there was no easy way to see the job buckets themselves. c) There is no easy way to track down the cause of a large number of job buckets. To generate the dump file containing all the current job buckets in the system, run badmin diagnose -c jobreq. The file contains the job buckets in XML format by default. The default file name "jobreq_<host_name>_<date_and_time>.xml" is used if "-f logfile_name" is not specified. The file location is DIAGNOSE_LOGDIR if configured in lsb.params; otherwise, the file is in LSF_LOGDIR. Component: bapp badmin bhist bjobs bparams bqueues sbatchd mbatchd mbschd schmod_default.so schmod_parallel.so schmod_fairshare.so schmod_affinity.so schmod_advrsv.so schmod_dc.so.
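As a sketch of how such a dump could be post-processed, the snippet below counts buckets and pending jobs in a jobreq XML file. The element names (jobreq, bucket, njobs) are hypothetical, since this entry does not document the actual schema produced by badmin diagnose -c jobreq; only the parsing pattern is the point.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a jobreq dump; the real element names produced by
# "badmin diagnose -c jobreq" may differ -- this only illustrates post-processing.
sample = """
<jobreq>
  <bucket id="1"><njobs>120</njobs></bucket>
  <bucket id="2"><njobs>3</njobs></bucket>
  <bucket id="3"><njobs>47</njobs></bucket>
</jobreq>
"""

root = ET.fromstring(sample)
buckets = root.findall("bucket")
print("total buckets:", len(buckets))
print("total pending jobs:", sum(int(b.findtext("njobs")) for b in buckets))
```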

Date When using badmin ckconfig, LSF checks the host information from NIS or DNS. If the network is not stable and responds slowly, this process can take a long time, causing mbatchd to stop responding. The following parameter has been introduced in lsb.params:

Syntax: IGNORE_HOSTNAME_CHECK=Y|N

If this parameter is enabled, LSF ignores the check for host information in NIS or DNS.

Default: N

Date This fix allows LSF users or administrators to use wildcard characters in LSB_JOB_TMPDIR, JOB_SPOOL_DIR, the job CWD, and job output directories, including the following characters:
- LSB_JOB_TMPDIR: %H
- JOB_SPOOL_DIR: %H, %P, %U, %C, and %JG
- Job CWD and output directories: %H
For more details on how to use these wildcard characters with LSF working on GPFS, refer to IBM Platform LSF Best Practices and Tips. Component: sbatchd bparams.

Date Add support to perform logic after a job is submitted by bsub or after a job is modified by bmod. Similar to how esub scripts are run before job submission or job modification, espub scripts are run after the operation. Component: bsub bmod brestart mesub.
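An illustrative lsb.params fragment combining the two directory-related settings above (the spool path is an example, and the inline comments are annotations rather than part of any shipped configuration):

```
# lsb.params (illustrative example)
Begin Parameters
IGNORE_HOSTNAME_CHECK=Y            # skip NIS/DNS host checks during badmin ckconfig
JOB_SPOOL_DIR=/share/lsf/spool/%H  # %H expands per execution host
End Parameters
```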

Date When the LSF_NIOS_PEND_TIMEOUT environment variable is set, interactive jobs cannot be executed after the LSF_NIOS_PEND_TIMEOUT value expires. The job is killed and returns a message such as "Job <xxx> is being terminated". You can use the LSF_NIOS_DIE_CMD environment variable to specify a customized command and output message when the LSF_NIOS_PEND_TIMEOUT value expires. See the following example:

user@host1: setenv LSF_NIOS_PEND_TIMEOUT 1
user@host1: setenv LSF_NIOS_DIE_CMD "bkill %J > /dev/null; echo job %J is terminated by bkill;"
user@host1: echo $LSF_NIOS_DIE_CMD
bkill %J > /dev/null; echo job %J is terminated by bkill;
user@host1: bsub -I "echo test"
Job <16> is submitted to default queue <normal>.
<<Waiting for dispatch...>>
job 16 is terminated by bkill

About the LSF_NIOS_DIE_CMD environment variable:
1. The default value is "bkill jobid".
2. LSF_NIOS_DIE_CMD supports the %J variable, so you can use the job ID when you specify the custom command for LSF_NIOS_DIE_CMD.
Component: bsub.

Date Add support to expand the allremote keyword that appears in the HOST column of the bmgroup output. By expanding allremote, bmgroup displays leased-in hosts from other clusters instead of the allremote keyword. To enable this feature, define LSB_BMGROUP_ALLREMOTE_EXPAND=Y in the appropriate configuration file: to enable "allremote" to be expanded for all users, edit lsf.conf and define LSB_BMGROUP_ALLREMOTE_EXPAND=Y; to enable "allremote" to be expanded only for a specific user, set LSB_BMGROUP_ALLREMOTE_EXPAND=Y as an environment variable in the user's local environment before issuing the command. Component: bmgroup.

Date For Red Hat Enterprise Linux (RHEL) version 6.6 Beta and later, there is a MemAvailable field in /proc/meminfo. If MemAvailable is present, read this value directly from /proc/meminfo for the available memory load indicator instead of calculating the value. Component: lim.

Date This enhancement allows the system to kill the job using the most CPU when the average logical CPU r15m value and the ut value both reach configured thresholds on the host, allowing other jobs on the host to run smoothly. A job is considered the worst CPU-offending job on a host if it is using the most CPU (system time + user time) per average assigned slot during the check period. When a job is killed as the worst CPU-offending job, the exit reason is the same as when a job's normal CPU limit is reached: "job killed after reaching LSF CPU usage limit". This solution is configured through a new configuration parameter in lsf.conf:

Syntax: LSB_CPU_USAGE_ENF_CONTROL=<Average Logic CPU r15m Threshold>:<UT Threshold>:<Check Interval>

1) Average Logic CPU r15m Threshold: A threshold for the maximum allowed quotient of the host's lsload r15m value divided by the number of logical CPUs on the host; that is, the average CPU queue length during the last 15 minutes for one logical CPU on the host. It must be a floating-point number equal to or greater than zero, for example, 7.8, 2.1, or 0.9.
2) UT Threshold: A threshold for the maximum allowed value of the host's lsload ut value. The ut value is the CPU utilization exponentially averaged over the last minute, between 0 and 1. It must be a floating-point number between 0 and 1, for example, 0.4 or 0.5.
3) Check Interval: The minimum period of time between two consecutive checks of the host's r15m and ut information. This value must be no less than the value of SBD_SLEEP_TIME, and the unit is in

seconds. For example, 20, 40, or 60.
4) The host is considered to be in CPU overload when both <Average Logic CPU r15m Threshold> and <UT Threshold> have been reached.
5) This parameter does not affect jobs running across multiple hosts.

Default: Not defined
Component: sbatchd

Date LSF's global fairshare scheduling policy divides the processing power of Platform MultiCluster (MultiCluster) and the LSF/XL feature of Platform LSF Advanced Edition among users to provide fair access to all resources, so that every user can use the resources of multiple clusters according to their configured shares. Global fairshare is supported in Platform LSF Standard Edition and Platform LSF Advanced Edition. Component: mbatchd sbatchd mbschd gpolicyd badmin bgpinfo bqueues schmod_advrsv.so schmod_affinity.so schmod_aps.so schmod_bluegene.so schmod_cpuset.so schmod_craylinux.so schmod_crayx1.so schmod_dc.so schmod_default.so schmod_dist.so schmod_fairshare.so schmod_fcfs.so schmod_jobweight.so schmod_limit.so schmod_mc.so schmod_parallel.so schmod_preemption.so schmod_pset.so schmod_ps.so schmod_reserve.so schmod_rms.so schmod_xl.so libbat.a libbat.so liblsf.a liblsf.so lsbatch.h.

Date Add support to show the settings for pending time, interactive jobs, exclusive jobs, and run time limit, either by running bjobs -o pend_time, bjobs -o interactive, bjobs -o exclusive, or bjobs -o runtimelimit/rtlimit, or by adding pend_time, interactive, exclusive, or runtimelimit/rtlimit to LSB_BJOBS_FORMAT in lsf.conf. For example:

bjobs -o "jobid pend_time interactive exclusive runtimelimit"
JOBID PEND_TIME INTERACTIVE EXCLUSIVE RUNTIMELIMIT
1     20        Y           N         100.0/host
                N           Y         -

1. For a pending job, the PEND_TIME is the current time minus the job's submission time.

2. For a dispatched (running or suspended) job, the PEND_TIME is the job's start time minus the job's submission time.
3. For a requeued, migrated, or rerun job, the PEND_TIME is the current time (the re-dispatch time) minus the time the job was requeued, migrated, or rerun.
4. Jobs that are submitted with the following bsub options are treated as interactive jobs: -I, -Ip, -Is, -IS, -ISp, -ISs, -IX.
5. bjobs -o exclusive shows Y for jobs that are submitted with the -x option, a compute unit exclusive request, or an affinity exclusive request.
6. The RUNTIMELIMIT is the merged value of the job-level run time limit assignment, the application-level run time limit setting, and the queue-level run time limit setting. If ABS_RUNLIMIT is enabled, the RUNTIMELIMIT is not normalized by the host CPU factor.
7. For IBM Platform LSF MultiCluster ("MultiCluster") with a job-level run time limit specified, "bjobs -o runtimelimit" shows the normalized run time on both the submission cluster and the execution cluster. Defining the run time limit at the application or queue level in the submission cluster does not affect the job's run time on the execution cluster, so defining it in the submission cluster is meaningless. However, when the run time limit is defined at the application or queue level in the submission cluster, running "bjobs -o runtimelimit" in the submission cluster still shows the combined run time limit of the submission cluster as being different from the effective run time limit at the execution cluster, while running "bjobs -o runtimelimit" in the execution cluster shows the effective run time limit.
Component: bjobs.

Date Improvements to job chunking to address the following issues: A job's running time is not always predictable at the time of its submission. If such jobs are chunked but actually run for a very long time, other jobs in the same chunk are blocked and wait for the long-running job to finish.
There is no way to reschedule these waiting jobs even if there are enough free resources. Traditional LSF job chunking always chunks jobs together regardless of whether those jobs could run without being chunked, which in some scenarios impacts resource utilization. In lsb.queues or lsb.applications, configure the new parameter CHUNK_MAX_WAIT_TIME together with CHUNK_JOB_SIZE on some queues or application profiles.

Syntax: CHUNK_MAX_WAIT_TIME = <seconds>

If a job is in WAIT status for longer than the configured time period, LSF removes the job from the job chunk and reschedules it. The LSF scheduler ensures that such jobs are run instead of being chunked as a waiting member again when there are eligible resources. The application profile setting overrides the queue-level configuration. Note: After a chunk job's waiting time exceeds CHUNK_MAX_WAIT_TIME, it may continue in WAIT status for one or more SBD_SLEEP_TIME cycles before being rescheduled, because sbatchd checks the timeout periodically and the check can be delayed if sbatchd is busy handling requests from mbatchd. In lsb.params, configure the new parameter ADAPTIVE_CHUNKING=Y to enable this feature. Note: This feature is not supported in the backfill and preemption phases in LSF. Component: bapp badmin bhist bjobs bparams bqueues sbatchd mbatchd mbschd schmod_default.so schmod_parallel.so schmod_fairshare.so schmod_affinity.so schmod_advrsv.so schmod_dc.so.

P Date Enhancements to LSF when running in the Linux x64 environment:
1. elim.gpu.ext reports GPU utilization:
a) elim.gpu.ext reports the utilization for each GPU on the host.
b) elim.gpu.ext reports the average utilization for all GPUs in shared mode on the host.
2. Optimized GPU allocation policies inside sbatchd:
a) For exclusive mode GPUs, try to allocate GPUs from the same NUMA node as the cores (best effort) at run time for affinity jobs. If there are multiple GPUs on multiple PCI buses in one NUMA node, LSF considers PCI bus information after considering affinity between CPU and GPU, and then attempts to allocate GPUs from the same PCI bus (best effort).
b) For exclusive mode GPUs, try to allocate GPUs from the same PCI bus (best effort) at run time for non-affinity jobs.
c) For shared mode GPUs, allocate shared mode GPUs to jobs using round-robin distribution.
3. Display which GPUs have been allocated to a job via bpost.
For more details on configuring this service patch, refer to the README file inside the patch package.
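A sketch of how the chunking parameters described above might be configured (the queue name, chunk size, and wait time are illustrative values, and the inline comments are annotations):

```
# lsb.queues (illustrative example)
Begin Queue
QUEUE_NAME          = short
CHUNK_JOB_SIZE      = 4      # chunk up to 4 jobs together
CHUNK_MAX_WAIT_TIME = 600    # free a job that has been in WAIT for over 10 minutes
End Queue

# lsb.params (illustrative example)
Begin Parameters
ADAPTIVE_CHUNKING=Y          # enable the adaptive chunking feature
End Parameters
```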



Platform LSF Desktop Support User s Guide

Platform LSF Desktop Support User s Guide Platform LSF Desktop Support User s Guide Version 7.0 Update 2 Release date: November 2007 Last modified: December 4 2007 Support: support@platform.com Comments to: doc@platform.com Copyright We d like

More information

Release Notes for Platform LSF. Platform LSF Version 7.0 Update 6 Release date: September 2009 Last modified: September 1, 2009

Release Notes for Platform LSF. Platform LSF Version 7.0 Update 6 Release date: September 2009 Last modified: September 1, 2009 Platform LSF Version 7.0 Update 6 Release date: September 2009 Last modified: September 1, 2009 Contents Release Notes for Platform LSF... 3 Upgrade and Compatibility Notes... 3 What s Changed in Platform

More information

TORQUE Resource Manager Release Notes

TORQUE Resource Manager Release Notes TORQUE Resource Manager 5.1.3 Release Notes The release notes file contains the following sections: New Features on page 2 Differences on page 4 Known Issues on page 7 Resolved Issues on page 8 1 New Features

More information

SmartSuspend. Achieve 100% Cluster Utilization. Technical Overview

SmartSuspend. Achieve 100% Cluster Utilization. Technical Overview SmartSuspend Achieve 100% Cluster Utilization Technical Overview 2011 Jaryba, Inc. SmartSuspend TM Technical Overview 1 Table of Contents 1.0 SmartSuspend Overview 3 2.0 How SmartSuspend Works 3 3.0 Job

More information

IBM Platform LSF 9.1.3

IBM Platform LSF 9.1.3 IBM Platform LSF 9.1.3 Bill.McMillan@uk.ibm.com Global Product Portfolio Manager, IBM Platform LSF Family 1 IBM Platform LSF Family Key Drivers Unceasing demand for Compute Scalability and Throughput Node

More information

Using Platform LSF HPC Features

Using Platform LSF HPC Features Using Platform LSF HPC Features Version 8 Release date: January 2011 Last modified: January 10, 2011 Support: support@platform.com Comments to: doc@platform.com Copyright We d like to hear from you 1994-2011,

More information

Using Platform LSF HPC

Using Platform LSF HPC Using Platform LSF HPC Version 7 Update 5 Release date: March 2009 Last modified: March 13, 2009 Support: support@platform.com Comments to: doc@platform.com Copyright We d like to hear from you 1994-2009,

More information

THE PROCESS ABSTRACTION. CS124 Operating Systems Winter , Lecture 7

THE PROCESS ABSTRACTION. CS124 Operating Systems Winter , Lecture 7 THE PROCESS ABSTRACTION CS124 Operating Systems Winter 2015-2016, Lecture 7 2 The Process Abstraction Most modern OSes include the notion of a process Term is short for a sequential process Frequently

More information

Using Platform LSF MultiCluster. Version 6.1 November 2004 Comments to:

Using Platform LSF MultiCluster. Version 6.1 November 2004 Comments to: Using Platform LSF MultiCluster Version 6.1 November 2004 Comments to: doc@platform.com Copyright We d like to hear from you Document redistribution policy Internal redistribution Trademarks 1994-2004

More information

B. V. Patel Institute of Business Management, Computer &Information Technology, UTU

B. V. Patel Institute of Business Management, Computer &Information Technology, UTU BCA-3 rd Semester 030010304-Fundamentals Of Operating Systems Unit: 1 Introduction Short Answer Questions : 1. State two ways of process communication. 2. State any two uses of operating system according

More information

2/26/2017. For instance, consider running Word Count across 20 splits

2/26/2017. For instance, consider running Word Count across 20 splits Based on the slides of prof. Pietro Michiardi Hadoop Internals https://github.com/michiard/disc-cloud-course/raw/master/hadoop/hadoop.pdf Job: execution of a MapReduce application across a data set Task:

More information

Process management. What s in a process? What is a process? The OS s process namespace. A process s address space (idealized)

Process management. What s in a process? What is a process? The OS s process namespace. A process s address space (idealized) Process management CSE 451: Operating Systems Spring 2012 Module 4 Processes Ed Lazowska lazowska@cs.washington.edu Allen Center 570 This module begins a series of topics on processes, threads, and synchronization

More information

Kea Messages Manual. Kea Messages Manual

Kea Messages Manual. Kea Messages Manual Kea Messages Manual i Kea Messages Manual Kea Messages Manual ii Copyright 2011-2015 Internet Systems Consortium, Inc. Kea Messages Manual iii Contents 1 Introduction 1 2 Kea Log Messages 2 2.1 ALLOC Module....................................................

More information

CS 167 Final Exam Solutions

CS 167 Final Exam Solutions CS 167 Final Exam Solutions Spring 2018 Do all questions. 1. [20%] This question concerns a system employing a single (single-core) processor running a Unix-like operating system, in which interrupts are

More information

CS307: Operating Systems

CS307: Operating Systems CS307: Operating Systems Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building 3-513 wuct@cs.sjtu.edu.cn Download Lectures ftp://public.sjtu.edu.cn

More information

System Programming. Signals I

System Programming. Signals I Content : by Dr. B. Boufama School of Computer Science University of Windsor Instructor: Dr. A. Habed adlane@cs.uwindsor.ca http://cs.uwindsor.ca/ adlane/60-256 Content Content 1 Introduction 2 3 Signals

More information

Ch 4 : CPU scheduling

Ch 4 : CPU scheduling Ch 4 : CPU scheduling It's the basis of multiprogramming operating systems. By switching the CPU among processes, the operating system can make the computer more productive In a single-processor system,

More information

Batches and Commands. Overview CHAPTER

Batches and Commands. Overview CHAPTER CHAPTER 4 This chapter provides an overview of batches and the commands contained in the batch. This chapter has the following sections: Overview, page 4-1 Batch Rules, page 4-2 Identifying a Batch, page

More information

Univa Grid Engine Troubleshooting Quick Reference

Univa Grid Engine Troubleshooting Quick Reference Univa Corporation Grid Engine Documentation Univa Grid Engine Troubleshooting Quick Reference Author: Univa Engineering Version: 8.4.4 October 31, 2016 Copyright 2012 2016 Univa Corporation. All rights

More information

CSE 451: Operating Systems Winter Module 4 Processes. Mark Zbikowski Allen Center 476

CSE 451: Operating Systems Winter Module 4 Processes. Mark Zbikowski Allen Center 476 CSE 451: Operating Systems Winter 2015 Module 4 Processes Mark Zbikowski mzbik@cs.washington.edu Allen Center 476 2013 Gribble, Lazowska, Levy, Zahorjan Process management This module begins a series of

More information

Dr. Rafiq Zakaria Campus. Maulana Azad College of Arts, Science & Commerce, Aurangabad. Department of Computer Science. Academic Year

Dr. Rafiq Zakaria Campus. Maulana Azad College of Arts, Science & Commerce, Aurangabad. Department of Computer Science. Academic Year Dr. Rafiq Zakaria Campus Maulana Azad College of Arts, Science & Commerce, Aurangabad Department of Computer Science Academic Year 2015-16 MCQs on Operating System Sem.-II 1.What is operating system? a)

More information

Using Platform LSF on Windows. Version 6.0 February 2004 Comments to:

Using Platform LSF on Windows. Version 6.0 February 2004 Comments to: Using Platform LSF on Windows Version 6.0 February 2004 Comments to: doc@platform.com Copyright We d like to hear from you Document redistribution policy Internal redistribution Trademarks 1994-2004 Platform

More information

Programs. Program: Set of commands stored in a file Stored on disk Starting a program creates a process static Process: Program loaded in RAM dynamic

Programs. Program: Set of commands stored in a file Stored on disk Starting a program creates a process static Process: Program loaded in RAM dynamic Programs Program: Set of commands stored in a file Stored on disk Starting a program creates a process static Process: Program loaded in RAM dynamic Types of Processes 1. User process: Process started

More information

Introduction to High-Performance Computing (HPC)

Introduction to High-Performance Computing (HPC) Introduction to High-Performance Computing (HPC) Computer components CPU : Central Processing Unit CPU cores : individual processing units within a Storage : Disk drives HDD : Hard Disk Drive SSD : Solid

More information

IRIX Resource Management Plans & Status

IRIX Resource Management Plans & Status IRIX Resource Management Plans & Status Dan Higgins Engineering Manager, Resource Management Team, SGI E-mail: djh@sgi.com CUG Minneapolis, May 1999 Abstract This paper will detail what work has been done

More information

OPERATING SYSTEMS CS3502 Spring Processor Scheduling. Chapter 5

OPERATING SYSTEMS CS3502 Spring Processor Scheduling. Chapter 5 OPERATING SYSTEMS CS3502 Spring 2018 Processor Scheduling Chapter 5 Goals of Processor Scheduling Scheduling is the sharing of the CPU among the processes in the ready queue The critical activities are:

More information

Tasks. Task Implementation and management

Tasks. Task Implementation and management Tasks Task Implementation and management Tasks Vocab Absolute time - real world time Relative time - time referenced to some event Interval - any slice of time characterized by start & end times Duration

More information

Source OID Message Severity Cause Action

Source OID Message Severity Cause Action 13 CHAPTER This section describes the Prime Network system events. System events appear in the Prime Network Events System tab. They include a variety of events pertaining to the system activities, from

More information

RELEASE NOTES. Version NEW FEATURES AND IMPROVEMENTS

RELEASE NOTES. Version NEW FEATURES AND IMPROVEMENTS S AND S Implementation of the Google Adwords connection type Implementation of the NetSuite connection type Improvements to the Monarch Swarm Library Column sorting and enhanced searching Classic trapping

More information

/6)%DWFK$GPLQLVWUDWRU V4XLFN 5HIHUHQFH

/6)%DWFK$GPLQLVWUDWRU V4XLFN 5HIHUHQFH /6)%DWFK$GPLQLVWUDWRU V4XLFN 5HIHUHQFH Version 3.2 3ODWIRUP&RPSXWLQJ&RUSRUDWLRQ /6)%DWFK$GPLQLVWUDWRU V4XLFN5HIHUHQFH Copyright 1994-1998 Platform Computing Corporation All rights reserved. This document

More information

The RWTH Compute Cluster Environment

The RWTH Compute Cluster Environment The RWTH Compute Cluster Environment Tim Cramer 29.07.2013 Source: D. Both, Bull GmbH Rechen- und Kommunikationszentrum (RZ) The RWTH Compute Cluster (1/2) The Cluster provides ~300 TFlop/s No. 32 in TOP500

More information

PBS PROFESSIONAL VS. MICROSOFT HPC PACK

PBS PROFESSIONAL VS. MICROSOFT HPC PACK PBS PROFESSIONAL VS. MICROSOFT HPC PACK On the Microsoft Windows Platform PBS Professional offers many features which are not supported by Microsoft HPC Pack. SOME OF THE IMPORTANT ADVANTAGES OF PBS PROFESSIONAL

More information

Running the model in production mode: using the queue.

Running the model in production mode: using the queue. Running the model in production mode: using the queue. 1) Codes are executed with run scripts. These are shell script text files that set up the individual runs and execute the code. The scripts will seem

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 10 Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 Chapter 6: CPU Scheduling Basic Concepts

More information

Chapter 9: Virtual Memory

Chapter 9: Virtual Memory Chapter 9: Virtual Memory Silberschatz, Galvin and Gagne 2013 Chapter 9: Virtual Memory Background Demand Paging Copy-on-Write Page Replacement Allocation of Frames Thrashing Memory-Mapped Files Allocating

More information

Operating Systems Comprehensive Exam. Spring Student ID # 3/16/2006

Operating Systems Comprehensive Exam. Spring Student ID # 3/16/2006 Operating Systems Comprehensive Exam Spring 2006 Student ID # 3/16/2006 You must complete all of part I (60%) You must complete two of the three sections in part II (20% each) In Part I, circle or select

More information

CSE 410: Computer Systems Spring Processes. John Zahorjan Allen Center 534

CSE 410: Computer Systems Spring Processes. John Zahorjan Allen Center 534 CSE 410: Computer Systems Spring 2018 Processes John Zahorjan zahorjan@cs.washington.edu Allen Center 534 1. What is a process? Processes 2. What's the process namespace? 3. How are processes represented

More information

Installation Instructions for Platform Suite for SAS Version 7.1 for UNIX

Installation Instructions for Platform Suite for SAS Version 7.1 for UNIX Installation Instructions for Platform Suite for SAS Version 7.1 for UNIX Copyright Notice The correct bibliographic citation for this manual is as follows: SAS Institute Inc., Installation Instructions

More information

Cube Analyst Drive. Release Summary. Citilabs

Cube Analyst Drive. Release Summary. Citilabs Cube Analyst Drive Release Summary Cube Analyst Drive Release Summary Citilabs Cube Analyst Drive Release Summary This section documents changes included in each release of Cube Analyst Drive. You may

More information

Memory may be insufficient. Memory may be insufficient.

Memory may be insufficient. Memory may be insufficient. Error code Less than 200 Error code Error type Description of the circumstances under which the problem occurred Linux system call error. Explanation of possible causes Countermeasures 1001 CM_NO_MEMORY

More information

Admin Guide ( Unix System Administration )

Admin Guide ( Unix System Administration ) Admin Guide ( Unix System Administration ) ProFTPD Server Configuration ProFTPD is a secure and configurable FTP server, written for use on Unix and Unix-like operating systems. ProFTPD is modeled around

More information

Chapter 5: CPU Scheduling

Chapter 5: CPU Scheduling COP 4610: Introduction to Operating Systems (Fall 2016) Chapter 5: CPU Scheduling Zhi Wang Florida State University Contents Basic concepts Scheduling criteria Scheduling algorithms Thread scheduling Multiple-processor

More information

Radiometer AQT90 FLEX Troponin I ERROR CODES

Radiometer AQT90 FLEX Troponin I ERROR CODES Radiometer AQT90 FLEX Troponin I ERROR CODES The meter automatically performs electronic self-tests, i.e. environmental check ensures that the analyser is ready for testing and that consumables have not

More information

VMware vrealize operations Management Pack FOR. PostgreSQL. User Guide

VMware vrealize operations Management Pack FOR. PostgreSQL. User Guide VMware vrealize operations Management Pack FOR PostgreSQL User Guide TABLE OF CONTENTS 1. Purpose... 3 2. Introduction to the Management Pack... 3 2.1 How the Management Pack Collects Data... 3 2.2 Data

More information

IBM Spectrum LSF Version 10 Release 1.0. Using IBM Spectrum LSF License Scheduler IBM SCNN-NNNN-00

IBM Spectrum LSF Version 10 Release 1.0. Using IBM Spectrum LSF License Scheduler IBM SCNN-NNNN-00 IBM Spectrum LSF Version 10 Release 1.0 Using IBM Spectrum LSF License Scheduler IBM SCNN-NNNN-00 IBM Spectrum LSF Version 10 Release 1.0 Using IBM Spectrum LSF License Scheduler IBM SCNN-NNNN-00 Note

More information

Hadoop MapReduce Framework

Hadoop MapReduce Framework Hadoop MapReduce Framework Contents Hadoop MapReduce Framework Architecture Interaction Diagram of MapReduce Framework (Hadoop 1.0) Interaction Diagram of MapReduce Framework (Hadoop 2.0) Hadoop MapReduce

More information

Topic 4 Scheduling. The objective of multi-programming is to have some process running at all times, to maximize CPU utilization.

Topic 4 Scheduling. The objective of multi-programming is to have some process running at all times, to maximize CPU utilization. Topic 4 Scheduling The objective of multiprogramming is to have some process running at all times, to maximize CPU utilization. The objective of time sharing is to switch the CPU among processes so frequently.

More information

Mon Sep 17, 2007 Lecture 3: Process Management

Mon Sep 17, 2007 Lecture 3: Process Management Mon Sep 17, 2007 Lecture 3: Process Management September 19, 2007 1 Review OS mediates between hardware and user software QUIZ: Q: Name three layers of a computer system where the OS is one of these layers.

More information