Unix Files, the File System, and Storage


Files and directories:


The general term file refers to a stream of bytes. In UNIX, files are used to contain both user data and system data. A UNIX file system is the complete set of files, managed in part through a hierarchical structure.

A UNIX file is an abstraction that represents anything from which data can be taken or to which data can be sent. Hence, a file may be something stored in secondary memory, but it can also refer to the various input/output devices (keyboard, video display, printer, and so on) that can provide or accept data.

Note: See our page on AccessControl for further details on permission bits.

Ownership and other attributes of files

When a file or directory is created, it is owned by default by the current user, and its group is set to the user's group. The owner can be changed with chown and the group with chgrp.
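As a minimal sketch of chown and chgrp, the commands below work in a scratch directory and reassign a file to the current user and group, which needs no special privileges. The file name report.txt is hypothetical. In practice you would name a different user or group, and changing the owner to someone else normally requires root.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
touch report.txt

# Show the current owner (3rd column) and group (4th column).
ls -l report.txt

# Set the owner and then the group. Here both are set to the
# current user and group, so the commands succeed without root.
chown "$(id -un)" report.txt
chgrp "$(id -gn)" report.txt

ls -l report.txt
```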

Unix maintains a set of attributes for each file or directory. New files will have attributes set based on a default which can be modified with the umask command.
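The effect of umask can be sketched as follows. The mask value 027 and the file names are chosen only for illustration: the mask removes write permission for the group and all permissions for others from anything created afterwards in the same shell.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Print the current mask, then set a stricter one for this shell.
umask
umask 027

# New files start from mode 666 and new directories from 777;
# the bits set in the mask are cleared from those defaults.
touch private.txt
mkdir private_dir

# Show the resulting permissions of both entries.
ls -ld private.txt private_dir
```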

Unix maintains a set of timestamps for files and directories.

To see the last access time for a file (atime), use ls -lu. Use ls -lc to show the last time the file's status was changed (ctime), which covers changes to attributes such as permissions and ownership as well as changes to the contents.

To see these times with greater precision, use 'ls --full-time' with GNU ls. The option differs between operating systems; on a Mac, try ls -lT.

The stat command can provide more detailed information on files and directories.
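The timestamp options and stat can be tried together as in the sketch below. The file name notes.txt is hypothetical, and the last command assumes GNU stat (as found on Linux); BSD and macOS stat use a different format option.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
echo "hello" > notes.txt

ls -l  notes.txt   # modification time (mtime), the default
ls -lu notes.txt   # last access time (atime)
ls -lc notes.txt   # last status change time (ctime)

# Full attribute report for the file.
stat notes.txt

# GNU stat only: print selected fields with a format string.
stat -c '%n: %s bytes, mode %A, inode %i' notes.txt
```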

 

 

The UNIX file tree:


Figure: Organization of the filesystem tree on Ubuntu.

A UNIX file system is the collection of all installed files along with the hierarchical structure that organizes them. The top of the tree is root ('/').

 

File and Directory names


Every file and directory has a name. The name of your home directory is usually the same as your login, and you normally cannot rename it. However, you must choose names for any other files and directories you make. On most UNIX systems, file names may comprise from one to 255 of the following characters, in any combination:

In most cases, you should avoid file names that contain spaces or any of the following special characters:

& * \ | [ ] { } $ < > ( ) # ? ' " / ; ^ ! ~ %

Also avoid using command names as file names.

Hidden Files and Directories

A hidden file is one that is not listed when you use the simple ls command. A file or directory will be hidden if its name begins with a period. For example:

.hidden

.login

. (name for the current directory)

.. (name for the parent of the current directory)

would all be hidden. To list all the files in a directory, including the hidden ones, use the ls -a (list all) command.
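A short sketch of the difference, using hypothetical file names in a scratch directory:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
touch visible.txt .hidden

ls      # lists only visible.txt
ls -a   # also lists ., .. and .hidden
ls -A   # like -a, but leaves out . and ..
```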

Renaming and Moving Files

The ls command takes one pathname; now consider a command that uses two. The mv (move) command has the general form:

mv pathname1 pathname2

This means "move the file found at pathname1 to the position specified by pathname2".
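As an illustration, mv covers both renaming and moving. The file and directory names below are hypothetical.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
echo "draft text" > draft.txt
mkdir archive

# Rename: both pathnames are in the same directory.
mv draft.txt final.txt

# Move: the second pathname is a directory, so the file
# keeps its name and is placed inside that directory.
mv final.txt archive/

ls -R
```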

Creating and Copying Files

There are four common ways to create a UNIX file:

The cp (copy) command has the form:

cp pathname1 pathname2

This means "copy the file found at pathname1 and place the copy in the position specified by pathname2".
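A small sketch of cp with hypothetical file names. Unlike mv, the original stays in place, and the copy is a separate file with its own inode.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
echo "original" > a.txt
mkdir backup

# Copy under a new name in the same directory.
cp a.txt b.txt

# Copy into a directory, keeping the name.
cp a.txt backup/

# The contents match, but the inode numbers differ.
cmp a.txt b.txt && echo "identical contents"
ls -i a.txt b.txt
```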

Creating a File by Redirection

The second method of creating a new file is to redirect the output of a command. In other words, instead of displaying the result of the command on the screen, UNIX puts the result into a file. For example:

$ ls > filelist

Nothing appears on the screen because the output was rerouted into the file. If the file already exists, its previous contents are overwritten. If you want to add something to the end of this file instead:

$ ls >> filelist
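The two redirection operators can be compared in a scratch directory, as sketched below. The file names one and two are hypothetical.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
touch one two

# The shell creates filelist before ls runs, so filelist
# appears in its own listing.
ls > filelist
cat filelist

# >> appends a second listing instead of overwriting the first.
ls >> filelist
wc -l filelist
```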

Links

Although we have been saying that directory files contain other files and directories, that is not precisely true. If you could look inside a directory, you would find no files. Instead, you would see a list of the files that are supposed to be "contained" in that directory. The names on the list refer to the storage locations that actually hold the files. We say that the files are "linked" to the directory.

Generally, a link is a name that refers to a file. UNIX allows more than one link to the same file, so a file can have more than one name. Directory files always contain at least two links: (.), which is a link to the current directory itself, and (..), a link to the parent directory. Most ordinary files are created with just one link. You can create more links to a file using the ln (link) command:

$ ln filename newfilename

where filename is the name of an existing file, and newfilename is the new name you want to link to the file.
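As a minimal sketch with hypothetical file names, the commands below give one file a second name and show that both names refer to the same data.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
echo "shared data" > original.txt

# Create a second name for the same file.
ln original.txt alias.txt

# Both names show the same inode number and a link count of 2.
ls -li original.txt alias.txt

# Writing through one name is visible through the other.
echo "more data" >> alias.txt
cat original.txt
```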

The Long Listing

The UNIX operating system is designed to make it easy for users to share files. However, there are times when you do not want others to copy, move, or even examine the contents of your files and directories. You can easily control access to the files in your home directory. The ls -l command shows the current access permissions on a file or directory. Let's decipher an example:

drwxrwx--- 2 you engr 512 Apr 1 15:33 Cal
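A listing like the one above can be reproduced and read field by field. The sketch below is an illustration only: the directory name Cal is reused from the example, and the owner, group, size, and date will differ on your system.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
mkdir Cal

# rwx for the owner, rwx for the group, nothing for others.
chmod 770 Cal

# -d lists the directory itself rather than its contents.
ls -ld Cal

# Reading the example line from left to right:
#   drwxrwx---   file type (d = directory), then three permission
#                triplets: owner, group, others
#   2            link count
#   you          owner
#   engr         group
#   512          size in bytes
#   Apr 1 15:33  last modification time
#   Cal          name
```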

Refer to our discussion on Access Control for further details on allowing user, group, and world access, as well as on how programs such as sudo can temporarily promote a user's access rights to run system commands.

Links

The ln command creates links, so that the same file can be included in more than one directory. A hard link allows any number of directory entries to reference the same physical file. This makes a file available at different locations efficiently, as there is only one real copy of the data. All names that are hard-linked to one physical file have the same inode number.


Files can be linked using the ln (link) command. As an example, in a directory, lnEx, there are two subdirectories (myDirA and myDirB), each with a single file (fA.txt and fB.txt). In the lnEx directory, we create two files, myF1.txt and myF2.txt, that are linked to myDirA/fA.txt and myDirB/fB.txt respectively: myF1.txt as a hard link and myF2.txt as a symbolic link (ln -s), as the listing confirms. We then issue an ls with -i, which shows the inode numbers.

>ln ./myDirA/fA.txt myF1.txt; ln -s ./myDirB/fB.txt myF2.txt;

>ls -iRlt .
.:
total 12
948184 lrwxrwxrwx 1 jjm jjm 15 Jan 30 01:56 myF2.txt -> ./myDirB/fB.txt
947190 drwxrwxr-x 2 jjm jjm 4096 Jan 30 01:47 myDirA
1961 drwxrwxr-x 2 jjm jjm 4096 Jan 30 01:47 myDirB
947191 -rw-rw-r-- 2 jjm jjm 7 Jan 30 01:47 myF1.txt

./myDirA:
total 4
947191 -rw-rw-r-- 2 jjm jjm 7 Jan 30 01:47 fA.txt

./myDirB:
total 4
1962 -rw-rw-r-- 1 jjm jjm 7 Jan 30 01:47 fB.txt

Next, if we remove both original files:

>rm ./myDirA/fA.txt; rm ./myDirB/fB.txt;

>ls -iRlt
.:
total 12
1961 drwxrwxr-x 2 jjm jjm 4096 Jan 30 02:00 myDirB
947190 drwxrwxr-x 2 jjm jjm 4096 Jan 30 01:59 myDirA
948184 lrwxrwxrwx 1 jjm jjm 15 Jan 30 01:56 myF2.txt -> ./myDirB/fB.txt
947191 -rw-rw-r-- 1 jjm jjm 7 Jan 30 01:47 myF1.txt

./myDirB:
total 0

./myDirA:
total 0

For hard links, each new link to a file increments its link count, and the count is decremented as links are removed. In the example, we removed one reference to the file; the hard link myF1.txt still exists and points to the original file data, and its link count has dropped from 2 to 1. Hard links have several limitations. First, files can only be linked within the same filesystem. Second, directories cannot be linked. Symbolic links address these issues: they can point to directories, and they support 1) links that cross partitions and filesystems; 2) links to files that might be removed but are later restored.

In the example, after the original file fB.txt is removed, the symbolic link myF2.txt remains; however, it now points to a non-existent file. The link itself can still be listed, and appending data through it (for example with >>) creates a new file at the target path.
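The behaviour of a dangling symbolic link can be sketched as follows. The file names are hypothetical, and the last step assumes an ordinary directory that you own.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
echo "data" > target.txt

# Create a symbolic link, then remove the file it points to.
ln -s target.txt link.txt
rm target.txt

# The link is still listed, but reading through it fails.
ls -l link.txt
cat link.txt 2>/dev/null || echo "link.txt points to a missing file"

# Appending through the link recreates the target file.
echo "new data" >> link.txt
cat target.txt
```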

 

The Filesystem

A filesystem manages all aspects of obtaining and saving data. The Unix filesystem presents user and system data to users through a standard hierarchical structure, and it also serves as an operating system abstraction that defines how application-level programs access resources managed by the operating system. The Unix mount command allows a filesystem to be composed of smaller pieces, each usually mapped to its own portion of the filesystem tree. Typically these 'smaller pieces' are filesystem resources provided by different physical storage devices.
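One way to see how the tree is assembled from mounted pieces is the df command, sketched below. The variable name mount_point is hypothetical, and the awk step assumes that the device name contains no spaces.

```shell
# List every mounted filesystem with its size and mount point.
df -h

# Show only the filesystem that holds the current directory.
# -P forces the portable one-line-per-filesystem format.
df -P .

# Extract the mount point (6th column of the portable format).
mount_point=$(df -P . | awk 'NR==2 {print $6}')
echo "The current directory is on the filesystem mounted at: $mount_point"
```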

Related and useful commands or tools

The following are related to managing files and filesystems in Linux.

 

finding files
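As a minimal sketch of finding files, the standard find command walks a directory tree and prints the entries that match its tests. The directory layout below is hypothetical and is created only for the demonstration.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p src/lib
touch src/main.c src/lib/util.c src/README

# Find by name pattern (quoted so the shell does not expand it).
find . -name '*.c'

# Find only directories.
find . -type d

# Find regular files modified within the last 24 hours.
find . -type f -mtime -1
```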

Storage: Chapter 20

last update : 3/5/2017