Technical Guidelines for IATH Fellows


ABOUT THIS DOCUMENT

The Technical Guidelines are intended to help you take better advantage of IATH and UVA technical resources. If you can't find the information you need, send an e-mail to helpdesk@jefferson.village.virginia.edu.


1. MACHINE NAMES AND PASSWORDS

We will give you a list of the basic passwords and machine names that you need to know. You will be given a user id and default password for your jefferson account. For security, you should change your password as soon as possible.


2. EQUIPMENT

2.1. Legacy materials

If you have legacy machines or software that you wish to use, please talk to the staff about it. We will do our best, but, depending on the state of the old equipment and the nature of your project, you may be better off purchasing new machines and/or software.

2.2. Buying new equipment

IATH will help you select and purchase any new hardware and software that you need. Please note that there may be restrictions on some types of equipment purchases (such as materials for graduate students or for workers at outside institutions).

2.3. IATH hardware

IATH has two public machines, a PC and a Mac, that can be used by fellows and research assistants. There is a notebook for reserving time on the PC on the table next to the machine, but otherwise both machines are available on a first-come, first-served basis. We also have two scanners, one of which is for slides, and two printers, one color and one black and white. You can reserve one of the IATH laptops (one PC and one Mac) and a portable video projector for remote presentations.

The IATH conference table can be reserved for meetings via Corporate Time. The table has a Corporate Time account, "IATH-319 table", which is listed under Resources. Include the table when setting up "group agendas" for meetings here at IATH to avoid conflicts.

2.4. Library hardware

The UVa library and electronic centers have computer equipment, services, and technical support available for UVa faculty, students, and staff. The following may prove useful:

  • The Electronic Text Center (aka E-Text) home page is http://etext.lib.virginia.edu/. Click on "Services" for information about their resources.
  • The Geospatial and Statistical Data Center (aka Geostat) has a home page at http://fisher.lib.virginia.edu/. Click on "Staff and Services" to see a list of software and services.
  • The Digital Media Lab in the Robertson Media Center has several resources related to digital media and digital collections as well as help finding outside resources. The home page is http://www.lib.virginia.edu/clemons/RMC/dml.html.

If you need to use any of these resources, we can advise you as to whom you should contact.

2.5. Restrictions on use of UVa equipment

Please note that all IATH equipment is subject to UVa restrictions on equipment use (see http://www.virginia.edu/~polproc/pol/xvg2.html). Equipment that you purchase with money from IATH or grants is UVa property and therefore subject to these restrictions.


3. WEB SITE AND HOME DIRECTORY

You will be given a home directory and a project home page on the jefferson server. Your home directory will be /home/<user_id>. Your home page will be something like:

http://jefferson.village.virginia.edu/<project_name>/

The files for your home page will be stored on the jefferson server in the /home/<user_id>/public_html/ directory, inside your home directory.

Many fellows store project-related materials on their laptops or PCs and on jefferson. You should contact the System Administrator to be sure that your machine is backed up regularly so that you don't lose data.

3.1. Mark-up applications

If you are producing web materials as part of your project, you should have some familiarity with HTML. If you have not worked with HTML before, you should learn it. You may not be actually tagging in HTML but you should understand how an HTML document is put together and how different browsers display it. There are many books and on-line tutorials and references for HTML, such as Computer Hope's HTML Help, the Webmonkey HTML Basics, and the Web Design Group's tutorials. The UVa library also has short courses for faculty, staff, and students on the web, Photoshop, Flash, image processing, etc. The catalog of training programs is on-line.

3.2. Web applications

We have a relatively standard set of tools that we like to use for web development. If you are producing web materials and you need an application that we have never used before, we may not be able to give you normal technical support. You are not restricted to using IATH-approved applications, but you should be sure to talk to the technical staff before you start working with new software. We may be able to offer alternatives or to help you find other sources of technical support. The list here is the software we normally recommend.

  • Word processing. Word
  • Image processing. Photoshop
  • XML tagging and validation. NoteTabPro, XMetaL, BBedit (Mac)
  • XSL stylesheet composition. NoteTabPro
  • HTML design. Dreamweaver

3.3. Open source vs. proprietary software

When you are choosing software for your work, you should be aware of whether you are selecting open source or proprietary software. Open source software is freely distributed, usually on the web, and its source code can be seen and edited by users. Proprietary software is usually distributed by private companies for a fee and its source code is not visible to users. All else being equal, IATH recommends that you use open source software when possible. Proprietary software depends on support from the company that produces it: if the company goes out of business or decides not to support the software, users can be left stranded with broken or outdated software. Software upgrades can also cause problems if the upgrades are not backwards compatible (i.e., if they cannot read data from previous versions). Open source software is, of course, subject to its own set of problems, since it may not have useful documentation or user support and it may not have been adequately tested and debugged. But it is often a good source of solutions to specific problems that are not covered by the more general demands of the software market.

There are other factors that should be considered: Microsoft's Word, for example, is proprietary but it is so common as to be a standard for word processing. Most people will expect you to be able to read and produce .doc files. It may also be wise to buy products from companies with a solid history. Adobe, for example, has produced high-quality reliable graphics software for many years.

For more information on the open source movement, see the Open Source Initiative.

3.4. About standards

Digital scholarship is still a work in progress: new applications, new solutions, and new possibilities appear regularly and offer opportunities to create better and more effective projects. However, works from the past few years are showing signs of their age and sometimes do not operate properly with current hardware and software. To try to avoid this, digital communities are developing and using standards for creating, delivering, and viewing web-based content. Community-based standards are designed to accommodate a group's particular interests and needs. Technical standards, such as those supported by the World Wide Web Consortium, are designed to give developers interoperable specifications, guidelines, and tools. We strongly advise fellows to study and follow appropriate standards when designing and building projects.


4. MAILING LISTS

Most IATH projects have at least one mailing list, which is archived to preserve the project's development history and to keep a record of discussions and decisions. We suggest that you create a list on the mailman list server. For help, please send a note to helpdesk@jefferson.village.virginia.edu.


5. FTP

File Transfer Protocol (FTP) is a common method of transferring files between your local machine and a remote server. You will probably use it regularly to upload and download files, so you should consider the various methods available. There are various Windows applications, such as SecureFX, that give you an easy-to-use graphical interface for connecting to jefferson and other machines. The usual Mac FTP application is Fetch. If you are comfortable working on the command line, you can also run ftp.

In all cases, you will first connect to the client host (the remote machine that you want to connect to). Then you get files from the remote machine onto your local machine and put files from your local machine onto the remote machine.

5.1. Windows

There are several Windows-based FTP applications available, all of which have their own interfaces. WS_FTP comes with the Windows installation, and is reasonably acceptable (and free) so you should feel free to use it. You can download it and get documentation from ITC's web site. WS_FTP uses the same commands as the command-line version but in a graphical user interface, so if you have never used FTP before, you may want to read through the description of the command-line version (section 5.3, below) to become familiar with it.

5.2. Mac

The predominant Mac FTP application is Fetch. It also has its own documentation and user help. You can download Fetch for free and get information about using it on the ITC web site. Please note that if you configure it yourself you should set it to a secure connection. If you have any questions, please talk to an IATH staff member.

5.3. Command line

You can start an FTP session from a DOS or Unix command line with the ftp command. The usage is:

ftp <server name>

You will then be asked for your user id and password. When you have successfully connected you will see "ftp>" at the command line. The process will look something like this:

$ ftp jefferson.village.virginia.edu
Connected to jefferson.village.virginia.edu. FTP server
(Version wu-2.6.0(1) Wed Oct 20 09:55:43 EDT 1999) ready.
Name (jefferson.village.virginia.edu<none>): spw4s
331 Password required for spw4s.
Password: xxxxxx
230 User spw4s logged in.
ftp>

There are several commands that you can run in an ftp session. To see them all, type help after you have started a session. Note that a session may time out after a given period of time, in which case you need to type quit and then start a new session.

The most important commands are put and get (Figure 1):

  • The put command copies files from your local host (the machine you are working on) onto the client host (the server or machine where you want the files to go).
  • The get command copies files from the client host onto your local host.

Figure 1

To use either of these commands, you'll need to know the name of the file(s) you want to put or get. You'll also need to know whether each file is a binary or an ascii file. Binary files include image files and .exe files and must be sent in binary mode. Ascii files are text-based files, such as .txt or .pl files, and must be sent in ascii mode. A new session is set to ascii mode by default. To change to binary mode, type binary and hit enter. To go back to ascii mode, type ascii and hit enter.
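The binary/ascii rule can be sketched as a small shell function. This is only an illustration: the function name ftp_mode and the list of suffixes are assumptions made for this example, not something that IATH or the ftp program provides, and you would need to extend the list to match the files in your own project.

```shell
# Hypothetical helper: suggest an FTP transfer mode from a file's suffix.
# The suffix list is an assumption for illustration and is not complete.
ftp_mode() {
    case "$1" in
        *.txt|*.pl|*.html|*.xml) echo ascii ;;   # text-based files
        *) echo binary ;;                        # images, .exe files, anything unknown
    esac
}

ftp_mode foo.txt      # prints "ascii"
ftp_mode photo.gif    # prints "binary"
```

Falling back to binary for unknown suffixes is the cautious choice in this sketch, because binary mode copies a file byte for byte.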

Once you are in the correct mode, you can move your file(s).

 ftp> put foo.txt 

When you begin an FTP session, you are logged into your home directory on the client host. Type pwd (print working directory) to see the full path of this remote directory. To see a list of the contents of the remote directory, type ls. To change your directory on the client host, use cd:

ftp> cd /mydirectory/stuff

To change your directory on your local machine, use lcd.

To exit an ftp session, type quit.


6. SECURITY

There are some basic security precautions that everyone should observe, such as guarding your equipment against theft, using strong passwords, maintaining up-to-date virus protection software, and backing up data. This cannot be emphasized strongly enough: fellows have lost large chunks of data and entire hard drives to viruses.

The ITC web site has information on current security alerts and guidelines for safe computing at http://www.itc.virginia.edu/pubs/docs/RespComp/. If you think that your computer has been hacked or that you have a virus on your machine, please contact the IATH system staff as soon as possible. ITC also has a page on current virus alerts and fixes at http://www.itc.virginia.edu/desktop/virus/.

We will install Norton anti-virus software and back up your machine, but we encourage you and your staff to take reasonable steps to protect your data, such as not opening unknown attachments and not downloading suspect files, and to stay informed about new threats. If you have a Microsoft operating system, you will need to download critical Windows Updates promptly. For more information, see http://www.microsoft.com/security/default.mspx.


7. BACK-UPS

All IATH machines are backed up nightly. We cannot back up machines that you have at home, but we schedule weekly back-up sessions for laptops. The IATH servers are backed up nightly by ITC. If you have questions about back-ups, please talk to Shayne.


8. TRAINING

IATH will provide training or help finding appropriate classes for you and your research assistants either through ITC and library short courses or with outside companies. Depending on your project, you may need to work on Unix, Windows, and/or Mac operating systems. You may also need to learn basic Unix commands, imaging and scanning, XML, and so on.

ITC Training Services Group offers a wide assortment of staff and student instruction and user support at http://www.itc.virginia.edu/training/. The library offers a series of short courses that cover web basics, Photoshop, Flash, image processing, and so forth. The catalog of courses is on-line at http://www.lib.virginia.edu/usered/catalog.html. A list of other training opportunities is on the library's web site at http://www.lib.virginia.edu/usered/programs.html.


9. HELP DESK

If you have a problem or question regarding your hardware, software, or other technical matters, please either send an e-mail to helpdesk@jefferson.village.virginia.edu or call the System Administrator. The sysadmin sees all queries and assigns each one to an appropriate staff member within an hour of receiving it (during regular business hours).

9.1. Rules for use

IATH helpdesk problems include malfunctioning IATH hardware or software, installation of new IATH hardware or software, resuscitation of IATH systems, and problems with IATH networking. If a problem requires a long-term solution, the work should be scheduled through the Project Director rather than by going directly to staff members. If you have submitted a help desk problem by email, and you don't get a speedy and cheerful response with a thorough and effective follow-up, please mention it to the Director.

9.2. When to contact ITC

Some problems may require use of the ITC helpdesk. Problems that are like the problems listed above but involve software or hardware owned, distributed, or supported by ITC should go to the ITC helpdesk (call 924-3731 or go to http://www.itc.virginia.edu/helpdesk/). If you have a problem and you're not sure whether it's an ITC or an IATH problem, send e-mail to the IATH help desk. For more information and other ITC help, please see "Getting Help" from ITC, at http://www.itc.virginia.edu/help.html.

9.3. After-hour emergencies

If you run into an emergency situation that demands instant attention after regular business hours, contact the ITC emergency help desk (call 924-3731 or go to http://www.itc.virginia.edu/helpdesk/home.html#emergency).


10. OTHER KINDS OF HELP

If you have a non-technical problem or a query that doesn't require the help desk, the IATH staff will do its best to help you as quickly and efficiently as possible. If we don't immediately know the answer to your question, we will help you find out. However, it will save time and energy if you go to the person most likely to have the information you want. In some cases, you may be better served by someone outside IATH.

Please note that a staff member may have a pressing assignment and may not be able to give you immediate attention. If this happens, please be patient!

10.1. Who knows what

  • Bernie Frischer. General IATH policy issues; general funding and budget questions; selecting and purchasing equipment; planning and getting technical training; resolving problems with IATH staff; finding and training student employees; guidance with project development; legal issues
  • Dean Abernathy. Visualization; digital cultural heritage; digital architectural work
  • Worthy Martin. Implementation questions; general programming questions; help with technical development; imaging
  • Daniel Pitti. Project management issues; XML, XSL, and DTD help and development
  • Joy Shifflette. Financial issues; tracking and submitting expenses; running projects; coordinating work and finances with other UVa and non-UVa groups; using help desk; payroll; finding the Director
  • Regina Carlson. Fundraising; help preparing budgets and proposals; guiding proposals through the UVa machine; negotiating awards; consulting on post-award administration and reporting
  • Shayne Brandon. Questions about IATH server usage and Windows system administration; mailing lists; PHP; Unix system administration; back-up for laptops, Macs, and PCs; Perl
  • Robbie Bingler. Selecting and developing specialized software for projects; Java; CGI; networking technologies; Unix
  • Kim Dylla. Visualization; 3D graphics and animation software
  • Cindy Girard. XML and XSL; Unix; DynaWeb; Perl; DTDs; Tamino; eXist; back-up person for system administration
  • Felicia Johnson. Web design; Dreamweaver; Mac applications
  • Chad Keller. Imaging and design work; mapping software; 3D modeling and animation; Flash; Mac applications and Mac system administration
  • Doug Ross. PHP; CSS; JavaScript; XML and XSL
  • Sarah Wells. Technical documentation, training materials, and reports; editing; XML and XSL; web-based documentation
  • ITC. Jefferson system administration
  • E-Text. Generating electronic text

10.2. ITC software and services

ITC has PC and Mac software available for downloading either for free or for a reduced price. You can find information about various operating systems, documentation, and pointers to free software at http://www.itc.virginia.edu/desktop/central/.

10.3. Central Mail Service (CMS)

CMS is the University's e-mail server. You are not required to have an e-mail account on CMS, but it is a good idea. CMS is reliable and has the added advantage of Web Mail (below). All UVA faculty, staff, and students are entitled to a CMS account. To get an account, go to http://www.itc.virginia.edu/helpdesk/accounts/createacct.html and click "Central Mail Service" (note that an account on Blue Unix is not the same as a CMS account: Blue Unix is for files and CMS is for e-mail).

10.4. Web Mail

Web Mail is a browser-based e-mail reader for CMS account holders. If you have a CMS mail account, you can read and send mail from that account from any web browser. Note that you must have a CMS account to use Web Mail (see above). For instructions for getting a CMS account and using Web Mail, please go to http://www.mail.virginia.edu/.


11. FILE MANAGEMENT

File management is a vitally important task and requires planning and regular maintenance. If your files are not handled properly, you might overwrite or lose information, skip or repeat important tasks, misplace files, and tangle the entire project up in knots.

There are two major components that you need to pay attention to: file names and file control. File names are useful tools for tracking data, versions, and revision histories. File control is concerned with creating, copying, moving, and deleting files (among other tasks). In most cases, you will use Windows and Mac graphical interface tools, such as file folders and FTP applications, to manage files. You might also want to use a command-line interface (such as a DOS or SecureCRT window) to manage your files on jefferson or another server. In that case, you will need to know some basic commands and syntax for manipulating files.

11.1. File Names

Windows and Mac operating systems allow a great deal of flexibility with file names. Users can create very long file names with non-alphanumeric characters such as "!" and "#", and do not have to use a suffix (such as ".txt" or ".jpg"). While it may seem a good idea to call your file "draft#2," you may later not remember which draft the name refers to, what format the file is in, or why you chose to save this particular version. These kinds of names can also cause you to lose data: if you transfer files between machines with a program like Fetch or SecureFX, the program may balk at non-alphanumeric characters or even corrupt the file. Unix operating systems do not like file names with spaces in them, but they allow file names of up to 250 characters. Your applications may have trouble opening files without a suffix that identifies the file's format, even if the file is in fact the proper type.

To avoid these problems, observe good practice and name your files in a consistent and informative form. Use a name of no more than twenty alphanumeric characters and a suffix of three characters that indicates the file's format. You may also want to establish naming patterns that allow you and your project staff to remember where information is stored and what formats are used. For example, "jones2intro.xml" indicates the name of a work, the edition of the work, what portion of the work is contained in the file, and that the file is an XML file. It is also a good idea to avoid obscure or complex file names, such as "jk113004_124_2.gif". Not only is this kind of name hard to remember and difficult to use, but the content and version information in the name is unreasonably obscure. You will be much better off using a descriptive name like "menu_mainheader.gif" or "sonnets_1850.xml". Many applications allow spaces in file names (such as "old copy.doc") but not all: it is better to avoid them and use the "_" character instead (such as "old_copy.doc").

Note that in some cases (CD-burning software, for example) you may need to change your file names to fit an eight-character name and three-character suffix format.
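The naming guidelines in this section can be checked mechanically. The following is a minimal sketch, assuming that "alphanumeric" also admits the "_" character recommended above; the function name good_name is hypothetical and is not an IATH tool.

```shell
# Hypothetical helper: succeed if a file name follows the guidelines above
# (1-20 letters, digits, or "_", then "." and a three-character suffix).
good_name() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9_]{1,20}\.[A-Za-z0-9]{3}$'
}

for name in "jones2intro.xml" "old copy.doc" "draft#2" "menu_mainheader.gif"; do
    if good_name "$name"; then
        echo "ok:     $name"
    else
        echo "rename: $name"
    fi
done
```

Run against a list of project files, a check like this flags names with spaces, special characters, or missing suffixes before they cause trouble in a transfer.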

11.2. Working on the command line

Before you start, you need a user id and password on the server that you will be working on. If you are working on jefferson, you will automatically be assigned an id and a default password (if you have not received this information, send a note to helpdesk).

You first must open a connection to the server. On a Windows machine, use SecureCRT (available on the ITC web site at http://www.itc.virginia.edu/desktop/securecrt/securecrt.html). On a Mac, open a Terminal window and type the ssh command with the server name. E.g.,

$ ssh jefferson.village.virginia.edu

When you are on the command line and logged into jefferson, you can navigate through directories, manipulate files, and run programs with simple commands such as ls, cp, and cd. Note that the instructions in this section are specific to jefferson and may not work as expected on other servers.

The remainder of this section will provide some basic information about managing files from the command line. More detailed information is available on the web in several places with various levels of detail. Two possible places to start are UNIXhelp for Users (http://unixhelp.ed.ac.uk/) and Webmonkey's Unix Guide (http://hotwired.lycos.com/webmonkey/reference/unix_guide/).

11.3. Documentation conventions

There are certain conventions used when describing command syntax. Anything between [ ] is optional and anything between { } is required. The "|" character indicates that you must use either one of two options. Anything between < > is a placeholder that indicates what information should go in that place. For example:

scrumble [-xy] {-z | -a} <file name1> <file name2>

The scrumble command has four options (-x, -y, -z, and -a) and two input sources. In order to run scrumble, you can use -x and/or -y; you must use either -z or -a (but not both); and you must include two filenames.

11.4. Man pages

Man pages are a built-in system of command-line documentation. Every command should have a man (manual) page, which describes what the command does, how to use it, and what options are available (options are also called arguments or flags).

To see a man page, enter the command man plus the command name. E.g., to see the man page for rm, you would enter:

$ man rm

You can also often get information by typing the command name with a -help or --help flag. For example:

$ rm --help
Usage: rm [OPTION]... FILE...
Remove (unlink) the FILE(s).

  -d, --directory       unlink directory, even if non-empty 
                        (super-user only)
  -f, --force           ignore nonexistent files, never prompt
  -i, --interactive     prompt before any removal
  -r, -R, --recursive   remove the contents of directories 
                        recursively
  -v, --verbose         explain what is being done
      --help            display this help and exit
      --version         output version information and exit

To remove a file whose name starts with a `-', for example `-foo',
use one of these commands:
  rm -- -foo

  rm ./-foo

Note that if you use rm to remove a file, it is usually possible to 
recover the contents of that file.  If you want more assurance that 
the contents are truly unrecoverable, consider using shred.

Report bugs to <bug-fileutils@gnu.org>.

The output will vary from one command to another. You may get no more than a brief line of syntax or a longer explanation of the command's options and some helpful suggestions.

You can also look in the O'Reilly Unix in a Nutshell reference guide in the IATH reference library (the shelves above the public machines). If the book isn't there, it is probably sitting on someone's desk, so ask around.

11.5. Terminology

If you need more help, there are many technical and computer dictionaries on-line (one is FOLDOC, at http://foldoc.doc.ic.ac.uk/foldoc/index.html).

  • command. A program that carries out a particular task.
  • execute. Start a program. An executable file is a file that contains code for running a program.
  • input. Data fed into a program when it is run.
  • option. Supplemental information given when running a command. Options can specify how a command should behave; where it should find its input and put output; when, where, and how long it should run; etc. They are also called arguments or flags. Unix commands sometimes have a bewildering array of possible options but usually you only need to know about a few in order to use the command.
  • output. Data that is returned after running a program. You can often specify whether or not the output is sent directly to a file or appears on the command line.
  • process. A copy of a program which is currently running. Note that while there may be only one copy of a program (i.e., one file that contains the code to run the program), you can usually run multiple processes from a single program at the same time.
  • regular expression. Used for pattern matching when looking for particular pieces of text. A regular expression is a pattern, made up of alphanumeric characters (such as "foo123") and special characters that further describe the pattern, such as all words that begin with the letter K or everything that falls between "BEGIN" and "END." You might use a regular expression to find and replace an outdated tag or to identify files that contain someone's name. Steve Ramsey wrote a useful explanation of regular expressions on the E-Text web site.
  • script. A program written in a scripting language (such as Perl). A script can be a text file that is run several times or a line of code executed on the command line.
  • server. A computer that holds files, runs programs, and/or performs other services for other computers that are connected to it by a network.
  • shell. A shell (also called a wrapper) is the interface for the user to the operating system. It passes instructions from your command line to the operating system or server. When you enter commands on your command line you are working in a shell. The shell contains your environment settings, which tell the operating system who you are, what printer you use, where your home directory is, etc.
  • text file. A file that contains only text (rather than binary code). Perl scripts, HTML files, and *.txt files are all text files. Binary files, such as *.doc files, GIF files, and MP3 files, contain binary code and are unreadable by a human being.
  • user id. A unique name or number that has been assigned to a particular user so that the computer or network knows who that person is and what he or she is doing.
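To make the "regular expression" entry above more concrete, here is a small sketch using grep (one of the commands listed in section 11.6). The file names and their contents are invented for the example, and the word list holds one word per line so that "begins with K" can be tested line by line.

```shell
# Work in a scratch directory so no real project files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Invented sample files for the illustration.
printf 'Keats\nShelley\nKipling\n' > poets.txt
printf 'Letter from Jones to Smith\n' > letter1.txt
printf 'Letter from Brown to Smith\n' > letter2.txt

# "^K" matches lines that begin with the letter K.
k_names=$(grep '^K' poets.txt)
echo "$k_names"

# -l lists only the names of the files that contain a match.
jones_files=$(grep -l 'Jones' letter1.txt letter2.txt)
echo "$jones_files"
```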

11.6. Basic command-line commands

The commands here are useful for file management and for controlling your project staff's use of the project files. As with any file management tools, you should be careful when moving or deleting files: while the IATH technical staff can fix many errors, you can lose or overwrite work that has not been backed up.

cat | cd | chmod | chown | cp | diff | emacs | find | grep | groups | id | kill | less | ln | lpr | ls | mkdir | more | mv | passwd | pico | ps | rm | sed | ssh | su | tar | whatis | whois

cat Short for "concatenate," cat lets you read multiple files. It reads each file in sequence and prints them out to the screen. For example:

$ cat file1.txt file2.txt
This is file file1.txt
This is file file2.txt

You see the contents of the two files on your screen. You can also use the ">" character to copy the contents of files into a file. For example, if you enter:

$ cat file1.txt file2.txt > file3.txt

the cat command will copy the contents of file1.txt and file2.txt into a new file called file3.txt (however, if a file3.txt already exists in the working directory, it will be overwritten).
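If you want to keep what is already in the target file, the shell's ">>" operator adds to the end of a file instead of replacing it. The sketch below is an addition to the description above, not part of it, and the contents of file3.txt are invented for the example.

```shell
# Work in a scratch directory so no real project files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "This is file file1.txt" > file1.txt
echo "This is file file2.txt" > file2.txt
echo "Earlier notes" > file3.txt          # invented pre-existing content

# ">" would replace file3.txt; ">>" adds to the end of it instead.
cat file1.txt file2.txt >> file3.txt
cat file3.txt
```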

cd Use to move to another directory.

$ cd mydirectory/stuff

You can go back up a level by using "..":

$ cd ../anotherdirectory

To return to the previous directory, use "-".

$ cd -

If you aren't sure which directory you are currently in, use the pwd command.
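The cd forms above can be tried safely in a scratch directory. This is a minimal sketch; the directory names are the placeholder names used in the examples above, created here only for the demonstration.

```shell
# Build a throwaway directory tree to move around in.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p mydirectory/stuff anotherdirectory

cd mydirectory/stuff
pwd                          # shows the full path, ending in mydirectory/stuff
cd ../../anotherdirectory    # up two levels, then down into anotherdirectory
here=$(pwd)                  # remember where we are now
cd - > /dev/null             # back to mydirectory/stuff ("cd -" also prints the path)
back=$(pwd)
echo "$here"
echo "$back"
```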

chmod Change access permissions for a file, group of files, or a directory. Access permissions consist of read (r), write (w), and execute (x) permissions for the file owner, a group of users, and all others. You can see the current settings for a file or directory with ls -l (figure 2).

Figure 2

To change this file's permissions, you use chmod with options indicating which permissions you want to change. You can do this two ways: in absolute mode or symbolic mode. Absolute mode is more efficient but potentially confusing (see the Webmonkey site or the chmod man page for more information). Symbolic mode uses a "+" or "-" to add or remove permissions from the user (u), group (g), other (o), or to all (a). The syntax looks like this:

chmod [u, g, o, or a]{+ | -}{r | w | x} <file name>

For example, to give write permissions for the foo.txt file to anyone you would enter:

$ chmod a+w foo.txt

Notice that there are no spaces in the "a+w" option. To remove write permissions from all, you would simply change the "+" to a "-":

$ chmod a-w foo.txt

If you want to change permissions on every file in a directory, you can use the -R flag. For example, to give everyone permission to read all files in mydirectory you would enter:

$ chmod -R a+r mydirectory
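To watch symbolic mode at work, you can compare the permission string that ls -l reports before and after each change. This is a sketch on a throwaway file; the starting point is set in absolute mode, where 644 is shorthand for rw-r--r--, so that the result does not depend on your account's defaults.

```shell
# Work in a scratch directory so no real project files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "sample text" > foo.txt

chmod 644 foo.txt                        # absolute mode: start from rw-r--r--
before=$(ls -l foo.txt | cut -c1-10)     # first ten characters are the permissions
chmod a+w foo.txt                        # symbolic mode: add write for all
after=$(ls -l foo.txt | cut -c1-10)
chmod a-w foo.txt                        # remove write again, from everyone
final=$(ls -l foo.txt | cut -c1-10)
echo "$before  $after  $final"
```

Note that the last step also removes the owner's write permission, because "a" covers user, group, and other alike.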
chown Change the owner of one or more files or directories. You can use ls -l to see who currently owns the file (see figure 1, above, for a full explanation of the output). You must be logged in as the file's current owner or as root before you can change a file or directory's owner. The syntax is

chown <new owner> [: <new group>] <file name>

The <new owner> name is the new owner's user id (e.g., jmu2m). You can assign a new group as well (e.g., staff).

As with chmod, you can use the -R flag to change every file in a directory. You can also use the -h flag to change the ownership of a symbolic link: if you don't use it, the ownership of the referenced file is changed (see ln for more information about symbolic links).

cp Copy a file or directory.
$ cp file1.txt file2.txt 
Note that, in this example, if a file called file2.txt already exists in the current directory it will be overwritten.

You can also use this to copy a file into a different directory:

$ cp file1.txt /some/directory/somewhere/
This puts a copy of file1.txt in the /some/directory/somewhere/ directory.
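The examples above copy single files. To copy a whole directory, most versions of cp need the -R flag, which is not covered above; the sketch below assumes it is available on your system and uses placeholder names in a scratch directory.

```shell
work=$(mktemp -d)
mkdir "$work/mydirectory"
echo "hello" > "$work/mydirectory/file1.txt"

# -R copies the directory and everything inside it.
cp -R "$work/mydirectory" "$work/backup"

copied=$(cat "$work/backup/file1.txt")
echo "backup contains: $copied"
```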
diff Compares two files, line by line, and reports where there are differences. The syntax is

diff <file name1> <file name2>

So, suppose that you have two files, foo.txt:
This is file foo.txt 
It's a nice file
and bar.txt:
This is file bar.txt 
It's a nice file
but it has 3 lines of text.
If you run diff on these files you'll get the following results:
$ diff foo.txt bar.txt 
1c1 
< This is file foo.txt 
---
> This is file bar.txt 
2a3
> but it has 3 lines of text.
The first piece of output, 1c1, tells you that line 1 in the two files is not the same. It then shows the two lines. The second piece, 2a3, tells you that bar.txt has an extra line of text. If the files were identical, there would be no output.

This can be very useful for debugging and comparing small text files and scripts but the results can be confusing.
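If you only need to know whether two files differ, you can rely on diff's exit status instead of reading its output: it exits with 0 when the files match and 1 when they differ. The sketch below recreates foo.txt and bar.txt from the example in a scratch directory.

```shell
work=$(mktemp -d)
printf "This is file foo.txt\nIt's a nice file\n" > "$work/foo.txt"
printf "This is file bar.txt\nIt's a nice file\nbut it has 3 lines of text.\n" > "$work/bar.txt"

# Use diff as a yes/no test and throw away the detailed report.
if diff "$work/foo.txt" "$work/bar.txt" > /dev/null; then
    result="identical"
else
    result="different"
fi
echo "foo.txt and bar.txt are $result"

# Comparing a file with itself produces no output at all.
self_diff=$(diff "$work/foo.txt" "$work/foo.txt")
```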

emacs Emacs is a text editor that can be run from the command line. It is an excellent tool for working with text-based files but it may take some time to learn how to use it. It has a tutorial and help feature, which you may want to investigate. The tutorial is quite thorough and a good starting point. To run the program, just type emacs:
$ emacs

You can include a file name if you want to open an existing file. E.g.,

$ emacs foo.txt

You can get help with Ctrl-h (hold the Ctrl and "h" keys at the same time) and see the tutorial with Ctrl-h t (hit Ctrl-h, then hit "t"). If you have trouble, send a note to helpdesk@jefferson.village.virginia.edu.

While emacs is a good tool, there are other text editors that you may find better suited to some jobs. For simple jobs involving only a few files, pico or WordPad (on Windows machines) may be more efficient. We do not recommend jove because of its line length restrictions. Desktop word processors such as Word and Frame are helpful for writing documents, although we don't recommend them for writing HTML pages or code.

find Use this to find particular groups of files and directories. This is a useful command to know. The syntax is

find <path name(s)> <condition(s)>

You have to provide at least one <path name> (i.e., a directory path) and <condition>. You can specify several path names to look in multiple directories. The <condition> option defines the terms of the search. There are several possible conditions, and you should look in Unix in a Nutshell or the find man page for a full discussion. The more useful conditions are -name, -newer, and -user.
-name <pattern> Find files whose name matches <pattern>.
-newer <file> Find files that have been modified after the file named in <file>. You must provide a full path for the file (i.e., if the file is not in your current directory you have to say where it is).
-user <user id> Find files belonging to the specified user id.
For example, to find all .jpg files in /mydirectory, you would enter this:
$ find /mydirectory -name "*.jpg"
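Conditions can be combined, in which case a file must satisfy all of them. The sketch below uses -name together with -newer on a scratch directory; the file names and dates are made up for the illustration. The pattern is quoted so that the shell passes it to find unchanged rather than expanding it first.

```shell
work=$(mktemp -d)
mkdir "$work/mydirectory"

# Create three files with known modification times (format: YYYYMMDDhhmm).
touch -t 202001010000 "$work/mydirectory/old.jpg"
touch -t 202101010000 "$work/mydirectory/marker.txt"
touch -t 202201010000 "$work/mydirectory/new.jpg"

# All .jpg files in the directory.
all_jpgs=$(find "$work/mydirectory" -name "*.jpg")

# Only the .jpg files modified after marker.txt.
newer_jpgs=$(find "$work/mydirectory" -name "*.jpg" -newer "$work/mydirectory/marker.txt")

echo "$newer_jpgs"
```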
grep Search one or more files for a pattern (called a regular expression). The command returns the names of all files whose contents match the regular expression, along with the line(s) that match the pattern. The syntax is

grep <regular expression> [<directory or full or partial file name>]

You can use the "*" character to look for file names matching a certain pattern as well. It may take a bit of practice to learn to fine-tune your searches, since you may get matches that you didn't intend (a search for the word "red", for instance, will yield "retired", "adored", and "reduction").
* A wildcard that matches zero or more characters in the file name argument. For example, "foo*bar" matches "foobar", "foofbar", "fooBARbar", etc. (Inside the regular expression itself, "*" instead means zero or more of the preceding character.) If you wanted to look in all .txt files, for example, you could use "*.txt":
$ grep foo *.txt
You can also just use a "*" to indicate that you want to look in all files in the current directory.
$ grep foo *
. A wildcard that matches any one character. For example,
$ grep f.le *txt
matches "file" and "filename" but not "fooled."
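[ ] Use to indicate a set or range of characters, such as the numbers 1 through 4 (written as [1234] or [1-4]). Put the expression in quotes so that the shell does not try to interpret the brackets. For example:
$ grep "foo[1234]bar" foo.txt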
foo1bar
foo2bar
foo3bar
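You can search for a literal "*" or "." by escaping it with a "\" (for example, "foo\*bar"). Inside square brackets these characters are already treated literally, so no escape is needed there:
$ grep "foo[1234*]bar" foo.txt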
foo*bar
foo1bar
foo2bar
foo3bar
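One way to avoid the unintended matches mentioned above (the "red" example) is the -w flag, which most versions of grep support and which only matches whole words. The sketch below assumes your grep has -w and -c (count matching lines); the file contents are invented for the illustration.

```shell
work=$(mktemp -d)
printf 'a red door\nshe retired early\na reduction in cost\n' > "$work/notes.txt"

# Plain search: "red" also matches inside "retired" and "reduction".
substring_hits=$(grep -c red "$work/notes.txt")

# Whole-word search: only the line containing the word "red" by itself.
word_hits=$(grep -cw red "$work/notes.txt")

echo "substring matches: $substring_hits, whole-word matches: $word_hits"
```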
groups Use to find out which groups you or a specific user id belong to. For example:
$ groups spw4s 
users staff samba
id This command is used for user identification. The syntax is

id [<user id>] [-a]

Used by itself, it returns your current user id and the first group that you belong to.
$ id
uid=23304(spw4s) gid=100(users)
If you run it with an -a flag, it will show all groups that your user id belongs to.
$ id -a 
uid=23304(spw4s) gid=100(users) groups=100(users),
10(staff),518(samba)
You can run it with another user id to see which groups that id belongs to. E.g.,
$ id jmu2m
uid=21953(jmu2m) gid=254(systems)
kill Kill a process. The syntax is:

kill [-9] <one or more PIDs>

The -9 is an optional signal which essentially tells the command to kill the process no matter what.

To run this command, you need to know the targeted process's ID (PID). Run ps to see a list of which processes are currently running and their PIDs. For example,

$ ps
   PID TTY      TIME CMD
  2227 pts/37   0:00 emacs
 21385 pts/37   0:00 ksh

$ kill -9 2227 

$ ps
   PID TTY      TIME CMD
 21385 pts/37   0:00 ksh
[1] + Killed                   emacs &

In this example, we first run ps and see that there is an emacs process running. We don't want this process anymore, so it's killed. Notice that kill doesn't return a message to indicate whether or not it succeeded, so we run ps again to check that the process did indeed die.

Note that you can only kill your own processes.
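Without the -9 option, kill sends the default signal (TERM), which asks the process to shut down and gives it a chance to clean up. A common approach is to try that first and only fall back to -9 if the process is still there. The sketch below uses a harmless sleep process as a stand-in for a program you want to stop.

```shell
# Start a long-running background process and remember its PID.
sleep 60 &
pid=$!

# Ask politely first: plain kill sends the default TERM signal.
kill "$pid"
wait "$pid" 2>/dev/null || true

# kill -0 sends no signal; it only reports whether the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    kill -9 "$pid"            # last resort if the process ignored TERM
    still_running="yes"
else
    still_running="no"
fi
echo "process still running: $still_running"
```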

less This lets you browse through one or more files. It is very similar to the more command, except that you can move backwards and forwards. To see a single file, enter less with the name of a file:
$ less foo.txt
Hit the space bar or the down arrow to move forward through the file and the up arrow to move backwards.

To see several files, you can enter the file names in sequence or use a wildcard pattern (such as *.txt to look at all .txt files).

$ less foo.txt bar.txt

When you get to the end of the first file in the list, type :n to see the next file. At any point you can type :p to see the previous file or q to quit and get back to the command line.

ln Make links to files located elsewhere. There are two different kinds of links: hard and symbolic. A hard link is a pointer to a file which is actually located in a different directory and is indistinguishable from the original file. A symbolic link is an indirect pointer that only contains the file path of the original file. You can use the link's name to access the original file. The syntax is

ln [-s] <file path> <link name>

The -s flag makes a symbolic link. For example:
$ ln -s /stuff/myfile.txt foobar.txt
Note that the link name foobar.txt is treated as a normal file, which may cause problems if you already have a file named foobar.txt (that is, you will lose your previous file). However, you can easily identify symbolic links when you run ls, since they have a "@" character next to their names:
$ ls  
foo.txt
foobar.txt@
bar.txt
Use ls -l to see where the link points:
$ ls -l foobar.txt 
lrwxrwxrwx  1  spw4s users  13 Jul 26 12:02  foobar.txt -> /stuff/myfile.txt
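The practical difference between the two kinds of link shows up when the original file is removed. The sketch below, using placeholder names in a scratch directory, creates one link of each kind and then deletes the original: the hard link still reaches the contents, while the symbolic link is left pointing at a path that no longer exists.

```shell
work=$(mktemp -d)
echo "original text" > "$work/myfile.txt"

ln "$work/myfile.txt" "$work/hardlink.txt"        # hard link
ln -s "$work/myfile.txt" "$work/symlink.txt"      # symbolic link

rm "$work/myfile.txt"

# The hard link is another name for the same data, so it is still readable.
hard_content=$(cat "$work/hardlink.txt")

# -e follows the link, so it fails once the target is gone.
if [ -e "$work/symlink.txt" ]; then
    sym_state="ok"
else
    sym_state="dangling"
fi
echo "hard link: $hard_content / symbolic link: $sym_state"
```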
lpr Use this to print a file from the command line. You may need to provide a printer name. The syntax is

lpr [-P<printer name>] <file name>

Note that there is no space between the -P and the printer name. The standard printer on jefferson is the black and white HP laser jet near the IATH conference table. Its name is av_l1, so to send a printing job there you would enter
$ lpr -Pav_l1 foobar.txt
If you do not specify a printer, the job will be sent to the default printer (which is probably av_l1).

You can check the status of print jobs from the command line with the lpq and lpstat tools. The first checks the queue status of

ls Lists the contents of the directory. There are several options available with this tool, but the more useful ones are -l and -a. The -l option displays the files in long format, meaning that it shows each file's permissions, owner, date last modified, and byte size (please see figure 1 for an example). The -a option lists every file in the directory, including those whose name begins with a dot (those are called dot files and are normally not listed). The syntax is:

ls [-l] [-a] [<file or directory name>]

mkdir This will create a new directory. For example,
$ mkdir newdirectory
This creates a directory called "newdirectory" in the current directory. Note that the default permissions settings in a new directory give everyone permission to read the contents of the directory but only the owner (you) has permission to write to the directory. You can use chmod to change permissions.
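The default permissions described above come from a setting called the umask. The sketch below assumes a umask of 022, which produces exactly those defaults, and then uses chmod to let the group write to the directory as well. The directory name is a placeholder in a scratch location.

```shell
work=$(mktemp -d)
umask 022                      # assumed default: others may read but not write

mkdir "$work/newdirectory"
default_perms=$(ls -ld "$work/newdirectory" | cut -c1-10)

# Allow members of the directory's group to add and remove files.
chmod g+w "$work/newdirectory"
group_perms=$(ls -ld "$work/newdirectory" | cut -c1-10)

echo "$default_perms -> $group_perms"
```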
more This is very similar to the less command, except that it only lets you move forward through a file. To use, enter more along with the name of the file.
$ more foo.txt
Hit the space bar to move forward through the file.
mv Use to move files and directories. For example:
$ mv foo.txt /mydirectory
This moves foo.txt from the current directory to mydirectory. You could also move an entire directory:
$ mv /mydirectory /stuff/newdirectory
Note that mydirectory will no longer exist, but that all of its files will now be in newdirectory.

Be careful when moving multiple files, since the new files will overwrite existing files of the same name. The default setting of this command on jefferson will post a warning message if you are about to overwrite a file. You can use the -f flag to override this setting.

passwd Create or change your jefferson password. To use, just type in passwd. You will be prompted for your current password and then asked to type in the new one. Please note that this will not change your Windows NT server password.

The command will check your password to see if it can be guessed or is too close to your old password, so you may want to have a couple of possibilities in mind.

You can cancel this command midway through by hitting Ctrl-q.

pico Pico is a simple text editor and is very useful for simple editing or for small text files. If you are working with large files you would be better advised to use another text editing program, such as emacs or vi. We do not recommend jove as a text editor. The syntax is

pico [<file path>]

You can type in a new or existing file name: if you type a new name or no name, pico will create a new file. A new file looks something like figure 3:

Figure 3

Type in the text as you would in any basic word processing application (such as Notepad). You can use any of the commands listed in the footer by hitting the Ctrl key (displayed here as a "^" caret character) plus the specified letter. Hit Ctrl-a to move the cursor to the beginning of the line and Ctrl-e to move to the end of the line. Use Ctrl-d to delete characters. To cut and paste, use Ctrl-k to cut a line (hold down the Ctrl key and hit "k" multiple times to cut more than one line) and Ctrl-u to paste. To save to a file without closing the window, hit Ctrl-o then type in the file name. To insert a text file into the current window, hit Ctrl-r then type the file's path. To exit, hit Ctrl-x. If you haven't saved your work, you will be asked if you want to save.

Note that there is no man page for pico, but you can get to the pico help screen with Ctrl-g.

ps List information about processes currently running. You will see a list of processes that you own.
$ ps 
  PID TTY       TIME CMD 
20903 pts/62   0:00 emacs
20318 pts/62   0:00 ksh
This output shows that two processes are running. The PID column gives you each process's id (PIDs can be useful for tracking and killing individual processes), the TTY column identifies the terminal where the command's output is being displayed, the TIME column shows how long the process has been executing (if it is actively executing), and the CMD column tells you the name of the command.

There are several flags that you can use to change this list. The -f flag shows a full listing of all information about each process:

$ ps -f
  UID   PID  PPID  C    STIME TTY      TIME CMD
spw4s 20903 20318  0 10:42:05 pts/62   0:00 emacs
spw4s 20318 20283  0 10:37:49 pts/62   0:00 -ksh

The UID column is the user id of whoever owns the process (that is, whoever started it). You can ignore the PPID and C columns. STIME shows when you started the process.

You can use -a to see a longer list of programs currently running. If you use it with -f, you'll get a more complete picture.

$ ps -af
    UID   PID  PPID  C    STIME TTY      TIME CMD
   root 21967   670  0                   0:00 <defunct>
patrick 25013 24998  0 11:35:23 pts/39   0:00 /bin/bash
  sjm8k 21952 21941  0 10:56:17 pts/64   0:00 pine
  jjm2f  3863 11467  0                   0:42 <defunct>
   root 22279 20340  0 11:00:55 pts/63   0:00 -ksh
  mfr2v 24749 24733  0 11:31:38 pts/59   0:00 pine
  spw4s 20903 20318  0 10:42:05 pts/62   0:00 emacs
  spw4s 13069 13060  0 09:02:56 pts/34   0:01 pine
   root 25024 20318  0 11:35:27 pts/62   0:00 ps -af
   root  9022   670  0                   0:00 <defunct>
  kat2n 15143 15058  0   Aug 20 pts/30   0:01 pine
  jjm2f  7589  4884  0                   0:00 <defunct>
    pmc 15561 15543  0 09:51:27 pts/6    0:06 pine
  jjm2f  7035 11467  0                   0:45 <defunct>

Finally, you can use -A or -e to see the absolutely complete list of all processes currently running. This list includes processes that are not associated with a terminal (such as daemons).

rm Removes files and directories. To remove a directory and its contents you need to add the -r flag; the related rmdir command removes only empty directories. Usage is:

rm <file/directory name(s)>

If you are removing multiple files or a directory you will be asked to confirm each file. E.g.,
$ rm foo*txt 
rm: remove `foo.txt'? y 
rm: remove `foobar.txt'? y 
Enter "y" for yes and "n" for no. This is intended to prevent accidental deletions, but it can be irritating and time-consuming if you are trying to delete several files. You can use the -f flag to remove all files without prompting.

You must have write permissions to the file(s) you are trying to remove.

sed This is an editing utility: the name is an abbreviation of "stream editor." It reads one or more text files and edits them according to your instructions. This is useful if you need to make a global change to several documents (such as a search and replace). The command looks at each line of the input files, copies it to an internal buffer (called a "pattern space"), applies the editing commands to the line (if appropriate), and prints the output to either the command line or to a file. Note that the output does not overwrite the original input files since sed is working in the pattern space.

You can give the editing instructions on the command line or in a separate script. You can tell it to search for lines with particular patterns. For example, you can look for all lines containing the word "automobile" and then replace that word with "car." Or you can print all lines that fall between the words "START" and "END".
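The two tasks just described might look like the sketch below. The input file and its contents are invented for the illustration; the s command substitutes text, and -n with the p command prints only the selected range of lines.

```shell
work=$(mktemp -d)
printf 'intro\nSTART\nmy automobile is an old automobile\nEND\noutro\n' > "$work/in.txt"

# Replace every occurrence of "automobile" with "car" (the g means "all on the line").
replaced=$(sed 's/automobile/car/g' "$work/in.txt")

# Print only the lines from the one containing START through the one containing END.
block=$(sed -n '/START/,/END/p' "$work/in.txt")

printf '%s\n' "$replaced"
printf '%s\n' "$block"

# The input file itself is untouched; redirect the output to keep the result:
sed 's/automobile/car/g' "$work/in.txt" > "$work/out.txt"
```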

Unix in a Nutshell has a chapter devoted to sed and is a reasonable place to start. If you want to start writing serious scripts you can also look at IATH's copy of sed & awk (another O'Reilly book), which also has some useful information about using regular expressions (part of pattern matching).

ssh Open a secure shell to another server. To use it, type ssh plus the name of the server. You will be asked for a password. To exit the shell, type exit.
su Change user ids without having to log out. Your user id is the name you are logged in as. You might need to change to another user if you want to work with files that you don't have permission to use. You'll be asked for a password for the new id.
tar Copy files and directories to a single file, called a "tarfile". (Tar originally meant "tape archive" and was used to archive files.) You use the same command to open a tarfile. If you are tarring a directory with subdirectories, the file structure will be preserved in the tarfile. Note that when you open the file it will recreate that structure in your current directory.

The tar command has several options, which are described in full in the man page and the Unix in a Nutshell book. It is frequently used with the zip command, which compresses the file to a more manageable size. For best results, use the following syntax to create a tarfile:

$ tar -cvf <filename>.tar <file/directory name>
This creates a file called <filename>.tar. It does not delete the original file or directory. You now have a tarfile, but it is probably rather large (especially if you tarred an entire directory). You can use zip to compress the tarfile:
$ zip <filename>.tgz <tarfile name>
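The .tgz suffix indicates that this is a tarred and zipped file. (By convention, .tgz usually denotes a tarfile compressed with gzip rather than zip, so other tools may expect that format.) To unzip and untar the file, use this syntax: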
$ unzip <filename>.tgz 
$ tar -xvf <tarfile name>
This will extract and copy the contents of the tarfile, keeping the original names and directory structure. The .tgz and .tar files are not deleted, but must be removed by hand.
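Before extracting a tarfile it can be useful to check what is inside it with the -t flag, which lists the contents without unpacking anything. The sketch below builds a small placeholder directory, tars it, lists it, and then extracts it in a separate directory to show that the structure is preserved.

```shell
work=$(mktemp -d)
mkdir -p "$work/mydirectory/sub"
echo "one" > "$work/mydirectory/file1.txt"
echo "two" > "$work/mydirectory/sub/file2.txt"

cd "$work"
tar -cf mydirectory.tar mydirectory      # create the tarfile
listing=$(tar -tf mydirectory.tar)       # list its contents without extracting
printf '%s\n' "$listing"

# Extract somewhere else: the files land in the current directory.
mkdir unpacked
cd unpacked
tar -xf ../mydirectory.tar
restored=$(cat mydirectory/sub/file2.txt)
echo "restored: $restored"
```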

Tarfiles are a convenient form for moving directories and large sets of files via ftp, Fetch, etc. You can open them on Windows and Mac machines with the proper applications (e.g., WinZip).

whatis The whatis command displays a one-line summary of any documented Unix command (i.e., any command that has a man page). For example,
$ whatis cp 
cp     cp (1)   -   copy files
whois This is a UVa directory service for getting information about UVa users. If you enter a user id it will print out information from the UVa database:
$ whois spw4s 
Name:                    Sarah P. Wells
Mailid/Handle:           spw4s
Unix Uid:                23304
Classification:          Staff
Department:              Rs-Inst Adv Tech Humanities
Office Address:          Alderman Library 3rd Fl W.
Office Phone:            (434) 924-4527
                         (434) 924-4370
Registered E-Mail Addr:  spw4s@Virginia.EDU
                         spw4s@s.mail.virginia.edu
If you don't know the person's user id you can provide part of the name (e.g., whois wells). Note, though, that if you are looking up a common name you'll be asked for more information.
$ whois wells

There are 57 database matches to your search key 'wells'.
Please try another search key.  See the manual for whois 
if necessary. You can use the "whois -a" option to list 
all database matches. See also: "whois help".
In this case, you'll need to provide more of the name:
$ whois "wells, sar"

Name:                    Sarah P. Wells
Mailid/Handle:           spw4s
Unix Uid:                23304
Classification:          Staff
Department:              Rs-Inst Adv Tech Humanities
Office Address:          Alderman Library 3rd Fl W.
Office Phone:            (434) 924-4527
                         (434) 924-4370
Registered E-Mail Addr:  spw4s@Virginia.EDU
                         spw4s@s.mail.virginia.edu



© 2005 IATH at the University of Virginia. All rights reserved.