File Organization ~ Code Vidyalay

File Organization
It describes how records are stored within a file. A file is organised to ensure that records are available for processing. It should be designed with regard to activity and volatility information and the nature of the storage media. Other considerations are the cost of file media, the enquiry requirements of users, and the file’s privacy, integrity, security and confidentiality.

There are five file organization methods −

1. Sequential organisation
2. Indexed Sequential organisation
3. Inverted list organisation
4. Direct access organisation
5. Chaining

1. Sequential organization:
Sequential organization means storing and sorting records in physical, contiguous blocks within files on tape or disk. Records are also in sequence within each block. To access a record, the previous records within the block are scanned. In a sequential organization, records can be added only at the end of the file. It is not possible to insert a record in the middle of the file without rewriting the file.

In a sequential file update, transaction records are in the same sequence as in the master file. Records from both the files are matched, one record at a time, resulting in an updated master file. In a personal computer with two disk drives, the master file is loaded on a diskette into drive A, while the transaction file is loaded on another diskette into drive B. Updating the master file transfers data from drive B to A controlled by the software in memory.
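
The matching step described above can be sketched as a merge of two key-sorted lists. This is only an illustration: the record layout (a key and a name), the sample data, and the rule that a matching transaction replaces the master record are assumptions, not part of the original description.

```python
def update_master(master, transactions):
    """Merge two key-sorted record lists into an updated master list.

    Each record is a (key, value) tuple. A transaction whose key matches
    a master record replaces it; an unmatched transaction is added.
    """
    updated = []
    m, t = 0, 0
    # Walk both files in key order, one record at a time.
    while m < len(master) and t < len(transactions):
        m_key = master[m][0]
        t_key = transactions[t][0]
        if m_key < t_key:
            updated.append(master[m])        # no transaction for this record
            m += 1
        elif m_key > t_key:
            updated.append(transactions[t])  # new record
            t += 1
        else:
            updated.append(transactions[t])  # changed record
            m += 1
            t += 1
    # Copy whatever is left in either file.
    updated.extend(master[m:])
    updated.extend(transactions[t:])
    return updated


# Hypothetical sample data, both lists sorted by key.
old_master = [(101, "Asha"), (103, "Ravi"), (107, "Meena")]
todays_transactions = [(103, "Ravi K."), (105, "Sunil")]
new_master = update_master(old_master, todays_transactions)
print(new_master)
```

Because both inputs are in the same sequence, each file is read only once from start to end, which is why this style of update suits tape as well as disk.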

*Advantages:
i. Simple to design
ii. Easy to program
iii. Variable length and blocked records available
iv. Best use of storage space

*Disadvantages:
i. Records cannot be added in the middle of the file.

2. Indexed sequential organization:
Like sequential organization, indexed sequential organization stores data in physically contiguous blocks. The difference is in the use of indexes to locate records. There are three areas in disk storage: prime area, overflow area and index area.

The prime area contains file records stored by key or id numbers. All records are initially stored in the prime area.

The overflow area contains records added to the file that cannot be placed in logical sequence in the prime area.

The index area is more like a data dictionary. It contains keys of records and their locations on the disk. A pointer associated with each key is an address that tells the system where to find a record.
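
A minimal sketch of how the three areas could work together is shown below. The class name, the block size, and the choice to index the highest key in each block are assumptions made for illustration; real implementations differ in detail.

```python
import bisect


class IndexedSequentialFile:
    """Toy model of the prime, overflow and index areas (illustrative only)."""

    def __init__(self, sorted_records, block_size=2):
        # Prime area: records stored in key order, grouped into blocks.
        self.prime = [sorted_records[i:i + block_size]
                      for i in range(0, len(sorted_records), block_size)]
        # Index area: the highest key in each block; list position = block number.
        self.index = [block[-1][0] for block in self.prime]
        # Overflow area: later additions that do not fit the prime sequence.
        self.overflow = []

    def insert(self, key, value):
        # New records go to the overflow area until the file is reorganized.
        self.overflow.append((key, value))

    def find(self, key):
        # Search the index to pick one block, then scan only that block.
        block_no = bisect.bisect_left(self.index, key)
        if block_no < len(self.prime):
            for rec_key, value in self.prime[block_no]:
                if rec_key == key:
                    return value
        # Fall back to the overflow area.
        for rec_key, value in self.overflow:
            if rec_key == key:
                return value
        return None


# Hypothetical sample data.
f = IndexedSequentialFile([(10, "a"), (20, "b"), (30, "c"), (40, "d")])
f.insert(25, "x")
print(f.find(30), f.find(25), f.find(99))
```

The growing overflow list is what makes periodic reorganization necessary: the more records sit outside the prime area, the more of each lookup turns into a plain scan.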

*Advantages:
i. Indexed sequential organization reduces the magnitude of the sequential search and provides quick access for sequential and direct processing.
ii. Records can be inserted in the middle of the file.

*Disadvantages:
i. It takes longer to search the index for data access or retrieval.
ii. Unique keys are required.
iii. Periodic reorganization is required.

3. Inverted list organization:
Like the indexed sequential storage method, the inverted list organization maintains an index. The two methods differ, however, in the index level and record storage. The indexed sequential method has multiple index levels for a given key, whereas the inverted list method has a single index for each key type. In an inverted list, records are not necessarily stored in a particular sequence. They are placed in the data storage area, and the indexes are updated with the record key and location. Inverted lists are best for applications that request specific data on multiple keys. They are ideal for static files because additions and deletions cause expensive pointer updating.

*Advantages:
i. Used in applications requesting specific data on multiple keys.

Example:
Data for the flight reservation system.

The flight number, description and departure time are given as keys. In the data location area, no particular sequence is followed. If a passenger needs information about the Houston flight, the agent requests the records for the Houston flight. The DBMS carries out a sequential search to find the required records. The output will then be that flight number 170 departs at 10:10 A.M. and flight number 169 departs at 8:15 A.M.

If the passenger searches for information about a Houston flight that departs at 8:15, then the DBMS searches the table and retrieves R3 and R6. It then checks the flight departure time and retrieves R6, which stands for flight number 169.
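
The lookup in this example can be sketched as follows. Only records R3 and R6 follow the text; the remaining rows, the field names and the helper function are made up for illustration.

```python
from collections import defaultdict

# Data storage area: records in no particular sequence.
# R3 and R6 come from the example; the other rows are hypothetical.
records = {
    "R1": {"flight": 212, "description": "Dallas", "departure": "9:30 AM"},
    "R2": {"flight": 305, "description": "Miami", "departure": "8:15 AM"},
    "R3": {"flight": 170, "description": "Houston", "departure": "10:10 AM"},
    "R4": {"flight": 118, "description": "Denver", "departure": "11:45 AM"},
    "R5": {"flight": 240, "description": "Dallas", "departure": "2:05 PM"},
    "R6": {"flight": 169, "description": "Houston", "departure": "8:15 AM"},
}


def build_inverted_index(records, key_type):
    """Map each value of one key type to the locations of matching records."""
    index = defaultdict(list)
    for location, record in records.items():
        index[record[key_type]].append(location)
    return index


# One index per key type.
by_flight = build_inverted_index(records, "flight")
by_description = build_inverted_index(records, "description")
by_departure = build_inverted_index(records, "departure")

# "Houston flight": look up the description index.
houston = by_description["Houston"]

# "Houston flight at 8:15": check the departure time of each candidate.
houston_815 = [loc for loc in houston
               if records[loc]["departure"] == "8:15 AM"]
print(houston, houston_815)
```

Adding or deleting a record means touching every one of these indexes, which is the pointer-updating cost that makes inverted lists better suited to static files.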

4. Direct access organization:
In direct access file organization, records are placed randomly throughout the file. Records need not be in sequence because they are updated directly and rewritten back in the same location. New records are added at the end of the file or inserted in specific locations based on software commands.
Records are accessed by addresses that specify their disk locations. An address is required for locating a record, for linking records, or for establishing relationships. Addresses are of two types:

i. Absolute
ii. Relative.

An absolute address represents the physical location of the record. It is usually stated in the format of sector/track/record number. One problem with absolute addresses is that they become invalid when the file that contains the records is relocated on the disk.

A relative address gives a record location relative to the beginning of the file. There must be fixed length records for reference. Another way of locating a record is by the number of bytes it is from the beginning of the file. When the file is moved, pointers need not be updated because the relative location remains the same.
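
With fixed length records, the byte position of a record is simply its relative record number multiplied by the record length. The sketch below assumes a made-up 24-byte record layout (a 4-byte id and a 20-byte name) and uses an in-memory buffer in place of a disk file.

```python
import io
import struct

# Hypothetical fixed-length layout: 4-byte integer id + 20-byte name.
RECORD_FORMAT = "<i20s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 24 bytes


def write_record(f, relative_no, rec_id, name):
    """Write a record at its relative position (0 = first record)."""
    f.seek(relative_no * RECORD_SIZE)  # byte offset from the start of the file
    f.write(struct.pack(RECORD_FORMAT, rec_id, name.encode("ascii")))


def read_record(f, relative_no):
    """Read one record directly, without scanning the records before it."""
    f.seek(relative_no * RECORD_SIZE)
    rec_id, raw_name = struct.unpack(RECORD_FORMAT, f.read(RECORD_SIZE))
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii")


data_file = io.BytesIO()  # stands in for a file opened in binary mode
write_record(data_file, 0, 101, "Asha")
write_record(data_file, 1, 102, "Ravi")
write_record(data_file, 2, 103, "Meena")

# Update record 1 in place, then read records directly by relative number.
write_record(data_file, 1, 102, "Ravi K.")
print(read_record(data_file, 2), read_record(data_file, 1))
```

Nothing in these offsets depends on where the file sits on the disk, so moving the file leaves every relative address valid.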

*Advantages:
i. Records can be inserted or updated in the middle of the file.
ii. Better control over record allocation.

*Disadvantages:
i. Address calculation is required for processing.
ii. Impossible to process variable length records.

5. Chaining:
File organization requires that relationships be established among data items. It must show how characters form fields, fields form files and files relate to each other. Relationships are established through chaining, which uses pointers.

Example: The file below is an indexed sequential file of auto parts, sequenced by part number. A record can be retrieved by part number. To retrieve the next record, the whole file has to be searched. This can be avoided by the use of pointers.
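
A small sketch of such a chain follows. The parts data is hypothetical, and it assumes each record carries a pointer holding the relative position of the next record with the same description.

```python
# Each record: (part_no, description, next_pointer).
# next_pointer is the position of the next record in the same chain,
# or None at the end of the chain. All values are hypothetical.
parts = [
    (100, "brake pad", 2),
    (104, "oil filter", 3),
    (110, "brake pad", 4),
    (115, "oil filter", None),
    (120, "brake pad", None),
]


def follow_chain(records, start):
    """Collect part numbers by following pointers from a starting record."""
    chain = []
    position = start
    while position is not None:
        part_no, _description, next_pointer = records[position]
        chain.append(part_no)
        position = next_pointer  # jump straight to the next related record
    return chain


brake_pads = follow_chain(parts, 0)
oil_filters = follow_chain(parts, 1)
print(brake_pads, oil_filters)
```

Each step reads only the records that belong to the chain, so related records are reached without searching the whole file.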

Comparison

File Access
A file can be accessed using either Sequential Access or Random Access. File access methods allow computer programs to read or write records in a file.

Sequential Access
Every record on the file is processed starting with the first record until End of File (EOF) is reached. It is efficient when a large number of the records on the file need to be accessed at any given time. Data stored on a tape (sequential access) can be accessed only sequentially.

Direct (Random) Access
Records are located by knowing their physical locations or addresses on the device rather than their positions relative to other records. Data stored on a CD device (direct-access) can be accessed either sequentially or randomly.
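
The two access methods can be contrasted on the same data. In this sketch an in-memory buffer stands in for a file on a direct-access device, each line is one record, and the function names are illustrative.

```python
import io


def read_sequentially(f):
    """Process every record from the first one until EOF."""
    f.seek(0)
    records = []
    while True:
        line = f.readline()
        if not line:  # an empty result means End of File
            break
        records.append(line.rstrip(b"\n"))
    return records


def record_offsets(f):
    """Make one pass over the file, noting the address where each record starts."""
    f.seek(0)
    offsets = []
    while True:
        position = f.tell()
        if not f.readline():
            break
        offsets.append(position)
    return offsets


def read_direct(f, offsets, n):
    """Jump to record n by its address instead of reading records 0..n-1."""
    f.seek(offsets[n])
    return f.readline().rstrip(b"\n")


sample = io.BytesIO(b"alpha\nbravo\ncharlie\n")
all_records = read_sequentially(sample)
offsets = record_offsets(sample)
third = read_direct(sample, offsets, 2)
print(all_records, offsets, third)
```

On a tape only the first function is practical; on a disk or CD both work, and the better choice depends on how many of the records a job actually needs.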

Types of Files used in an Organization System
Following are the types of files used in an organization system −

Master file − It contains the current information for a system. For example, customer file, student file, telephone directory.

Table file − It is a type of master file that changes infrequently and is stored in a tabular format. For example, a file of ZIP codes.

Transaction file − It contains the day-to-day information generated from business activities. It is used to update or process the master file. For example, changes to the addresses of employees.

Temporary file − It is created and used whenever needed by a system.

Mirror files − They are exact duplicates of other files. They help minimize the risk of downtime when the original becomes unusable. They must be modified each time the original file is changed.

Log files − They contain copies of master and transaction records in order to chronicle any changes that are made to the master file. They facilitate auditing and provide a mechanism for recovery in case of system failure.

Archive files − Backup files that contain historical versions of other files.

Documentation Control
Documentation is the process of recording information for reference or operational purposes. It helps the users, managers, and IT staff who require it. Prepared documents must be updated on a regular basis so that the progress of the system can be traced easily.

If the system works improperly after implementation, documentation helps the administrator understand the flow of data in the system, correct the flaws, and get the system working.

Programmers or systems analysts usually create program and system documentation. Systems analysts usually are responsible for preparing documentation to help users learn the system. In large companies, a technical support team that includes technical writers might assist in the preparation of user documentation and training materials.

Advantages
->It can reduce system downtime, cut costs, and speed up maintenance tasks.

->It provides a clear description of the formal flow of the present system and helps in understanding the type of input data and how the output is produced.

->It provides an effective and efficient way of communicating about the system between technical and nontechnical users.

->It facilitates the training of new users so that they can easily understand the flow of the system.

->It helps users troubleshoot problems and helps managers make better final decisions about the organization's system.

->It provides better control over the internal and external working of the system.

Types of Documentation
In System Design, there are four main types of documentation −

1. Program documentation
2. System documentation
3. Operations documentation
4. User documentation

Program Documentation
It describes inputs, outputs, and processing logic for all the program modules.

The program documentation process starts in the system analysis phase and continues during implementation.

This documentation guides programmers in constructing modules that are well supported by internal and external comments and descriptions, so that the modules can be understood and maintained easily.
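
As an illustration of internal documentation, a module can state its inputs, outputs and processing logic in a comment block next to the code. The function below, its name, and its payroll rule are hypothetical and serve only to show the layout.

```python
def net_pay(gross_pay, tax_rate):
    """Compute an employee's net pay.

    Inputs:
        gross_pay: pay before deductions, a non-negative number.
        tax_rate: fraction of gross pay withheld, between 0 and 1.
    Output:
        Net pay, rounded to two decimal places.
    Processing logic:
        Validate both inputs, subtract the tax from gross pay, then round.
    """
    if gross_pay < 0:
        raise ValueError("gross_pay must not be negative")
    if not 0 <= tax_rate <= 1:
        raise ValueError("tax_rate must be between 0 and 1")
    tax = gross_pay * tax_rate  # amount withheld
    return round(gross_pay - tax, 2)


print(net_pay(1000, 0.2))
```

Keeping the description beside the code makes it more likely that both are updated together when the module changes.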

Operations Documentation
Operations documentation contains all the information needed for processing and distributing online and printed output. Operations documentation should be clear, concise, and available online if possible.

It includes the following information −

->Program, systems analyst, programmer, and system identification.

->Scheduling information for printed output, such as report, execution frequency, and deadlines.

->Input files, their source, output files, and their destinations.

->E-mail and report distribution lists.

->Special forms required, including online forms.

->Error and informational messages to operators and restart procedures.

->Special instructions, such as security requirements.

User Documentation
It includes instructions and information for the users who will interact with the system. For example, user manuals, help guides, and tutorials. User documentation is valuable in training users and for reference purposes. It must be clear, understandable, and readily accessible to users at all levels.

The users, system owners, analysts, and programmers all work together to develop a user’s guide.

User documentation should include −

->A system overview that clearly describes all major system features, capabilities, and limitations.

->Description of source document content, preparation, and processing, with samples.

->Overview of menu and data entry screen options, contents, and processing instructions.

->Examples of reports that are produced regularly or available at the user’s request, including samples.

->Security and audit trail information.

->Explanation of responsibility for specific input, output, or processing requirements.

->Procedures for requesting changes and reporting problems.

->Examples of exceptions and error situations.

->Frequently asked questions (FAQs).

->Explanation of how to get help and procedures for updating the user manual.

System Documentation
System documentation serves as the technical specifications for the information system (IS) and describes how the objectives of the IS are accomplished. Users, managers and IS owners need never reference system documentation. System documentation provides the basis for understanding the technical aspects of the IS when modifications are made.

->It describes each program within the IS and the entire IS itself.

->It describes the system’s functions, the way they are implemented, each program's purpose within the entire IS with respect to the order of execution, information passed to and from programs, and overall system flow.

->It includes data dictionary entries, data flow diagrams, object models, screen layouts, source documents, and the systems request that initiated the project.

->Most of the system documentation is prepared during the system analysis and system design phases.

->During systems implementation, an analyst must review system documentation to verify that it is complete, accurate, and up-to-date, and that it includes any changes made during the implementation process.

Code Vidyalay

30 Oct 2019
