Wednesday 27 May 2015

Assignment 2


CHAPTER 6

 

 

 

ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT

An effective information system provides users with accurate, timely, and relevant information.

Information is accurate when it is free of errors. Information is timely when it is available to decision makers when it is needed. Information is relevant when it is useful and appropriate for the types of work and decisions that require it. To understand why many systems fail to deliver such information, let’s look at how information systems arrange data in computer files and at traditional methods of file management.

 

FILE ORGANIZATION TERMS AND CONCEPTS

A computer system organizes data in a hierarchy that starts with bits and bytes and progresses to fields, records, files, and databases. A bit represents the smallest unit of data a computer can handle and is either a 0 or a 1. A group of bits, called a byte, represents a single character, which can be a letter, a number, or another symbol.

Bytes can be grouped to form a field, and related fields can be grouped to form a record. Related records can be collected to form a file, and related files can be organized into a database.

 

A field is a grouping of characters into a word, a group of words, or a complete number (such as a person’s name or age).

 

A record is a group of related fields, such as the student’s name, the course taken, the date, and the grade. A group of records of the same type is called a file. A group of related files makes up a database. An entity is a person, place, thing, or event on which we store and maintain information. Each characteristic or quality describing a particular entity is called an attribute.

Course, Date, and Grade are attributes of the entity COURSE. The specific values that these attributes can have are found in the fields of the record describing the entity COURSE.
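The hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the student names, the course values, and the variable names are made up, and a real system would keep these structures on disk rather than in memory.

```python
# A byte holds one character; here the letter "A" is stored as 8 bits.
byte = "A".encode("ascii")            # one byte
bits = format(byte[0], "08b")         # the 8 bits inside that byte

# A field is a group of characters, such as a name or a grade.
# A record is a group of related fields describing one entity.
record = {
    "Student_Name": "Jane Doe",       # hypothetical values
    "Course": "IS 101",
    "Date": "2015-05-27",
    "Grade": "B+",
}

# A file is a group of records of the same type.
course_file = [
    record,
    {"Student_Name": "John Roe", "Course": "IS 101",
     "Date": "2015-05-27", "Grade": "A"},
]

# A database is a group of related files.
database = {"COURSE": course_file, "FINANCIAL": []}

print(bits)
print(len(course_file), "records in the COURSE file")
```

Here the dictionary keys play the role of attributes, and the values stored under them are the field contents for one record.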

 

PROBLEMS WITH THE TRADITIONAL FILE ENVIRONMENT

In most organizations, systems tended to grow independently without a company-wide plan. Accounting, finance, manufacturing, human resources, and sales and marketing all developed their own systems and data files, and each application required its own files and its own computer program to operate.

 

The following are some of the problems facing the traditional file environment.

 

Data Redundancy and Inconsistency

Data redundancy is the presence of duplicate data in multiple data files so that the same data are stored in more than one place or location. Data redundancy occurs when different groups in an organization independently collect the same piece of data and store it independently of each other. Data redundancy wastes storage resources and also leads to data inconsistency, where the same attribute may have different values.
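A small sketch can make the link between redundancy and inconsistency concrete. The two "files", the customer ID, and the helper function below are hypothetical and only stand in for the separate files that two departments might keep.

```python
# Two departments keep their own copy of the same customer data.
# All names and values here are hypothetical.
sales_file = {"C100": {"name": "Acme Ltd", "phone": "555-0101"}}
accounting_file = {"C100": {"name": "Acme Ltd", "phone": "555-0199"}}

def find_inconsistencies(file_a, file_b):
    """Return (key, attribute, value_a, value_b) for every mismatch."""
    mismatches = []
    for key in sorted(file_a.keys() & file_b.keys()):        # records held twice
        for attr in sorted(file_a[key].keys() & file_b[key].keys()):
            if file_a[key][attr] != file_b[key][attr]:
                mismatches.append(
                    (key, attr, file_a[key][attr], file_b[key][attr])
                )
    return mismatches

conflicts = find_inconsistencies(sales_file, accounting_file)
print(conflicts)
```

The customer is stored twice (redundancy), and because one copy was updated and the other was not, the phone attribute now has two different values (inconsistency).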

 

 

Program-Data Dependence

Program-data dependence refers to the coupling of data stored in files and the specific programs required to update and maintain those files, such that changes in programs require changes to the data. Every traditional computer program has to describe the location and nature of the data with which it works. In a traditional file environment, any change in a software program could require a change in the data accessed by that program.
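The coupling can be illustrated with a program that hard-codes where a field sits in a fixed-width record. The record layout, the field widths, and the function name are assumptions made up for this sketch.

```python
# The layout is known only to this program: the name occupies
# columns 0-9 and gross pay occupies columns 10-17 (hypothetical layout).
GROSS_PAY = slice(10, 18)

def read_gross_pay(line):
    """Extract gross pay using positions hard-coded in the program."""
    return float(line[GROSS_PAY])

# A record written in the layout the program expects.
old_line = "Jane Doe".ljust(10) + "4200.00".rjust(8)
correct = read_gross_pay(old_line)

# The file is changed so the name field is 15 characters wide.
# The unchanged program now reads the wrong columns.
new_line = "Jane Doe".ljust(15) + "4200.00".rjust(8)
wrong = read_gross_pay(new_line)

print(correct, wrong)
```

The program does not fail after the layout change; it silently returns a wrong figure, which is why the program and the data file must always be changed together.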

 

Lack of Flexibility

A traditional file system can deliver routine scheduled reports after extensive programming efforts, but it cannot deliver ad hoc reports or respond to unanticipated information requirements in a timely fashion. The information required by ad hoc requests is somewhere in the system but may be too expensive to retrieve.

 

Poor Security

Because there is little control or management of data, access to and dissemination of information may be out of control. Management may have no way of knowing who is accessing or even making changes to the organization’s data.

                                             

Lack of Data Sharing and Availability

Because pieces of information in different files and different parts of the organization cannot be related to one another, it is virtually impossible for information to be shared or accessed in a timely manner. Information cannot flow freely across different functional areas or different parts of the organization.

6.2 THE DATABASE APPROACH TO DATA MANAGEMENT

Database technology cuts through many of the problems of traditional file organization. A more rigorous definition of a database is a collection of data organized to serve many applications efficiently by centralizing the data and controlling redundant data. Rather than storing data in separate files for each application, data are stored so as to appear to users as being stored in only one location. A single database services multiple applications.

 

DATABASE MANAGEMENT SYSTEMS


 

A database management system (DBMS) is software that permits an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs. The DBMS acts as an interface between application programs and the physical data files. When the application program calls for a data item, such as gross pay, the DBMS finds this item in the database and presents it to the application program. Using traditional data files, the programmer would have to specify the size and format of each data element used in the program and then tell the computer where they were located.
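As a rough sketch of this interface role, the example below uses Python's built-in sqlite3 module as a stand-in DBMS. The employee table, its columns, and the get_gross_pay function are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, gross_pay REAL)")
conn.execute("INSERT INTO employee VALUES (?, ?)", ("Jane Doe", 4200.0))

def get_gross_pay(connection, name):
    """Ask for a data item by name; the DBMS locates it in storage."""
    row = connection.execute(
        "SELECT gross_pay FROM employee WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

print(get_gross_pay(conn, "Jane Doe"))
```

The application names the item it wants (gross_pay) and never states the size, format, or physical location of the data; those details are left to the DBMS.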

 

How a DBMS Solves the Problems of the Traditional File Environment

A DBMS reduces data redundancy and inconsistency by minimizing isolated files in which the same data are repeated. The DBMS may not enable the organization to eliminate data redundancy entirely, but it can help control redundancy.  The DBMS enables the organization to centrally manage data, their use, and security.

Relational DBMS

Relational databases represent data as two-dimensional tables (called relations). Tables may be referred to as files. Each table contains data on an entity and its attributes. For example, Microsoft Access is a relational DBMS for desktop systems.

 

Operations of a Relational DBMS

Relational database tables can be combined easily to deliver data required by users, provided that any two tables share a common data element.
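Combining tables usually relies on three basic operations: select (pick rows), join (match rows from two tables on the shared element), and project (pick columns). The sketch below shows all three in one SQL query run through Python's sqlite3 module. The supplier and part tables and all of their values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT
    );
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT,
        supplier_number INTEGER
    );
    INSERT INTO supplier VALUES (8259, 'Supplier A'), (8261, 'Supplier B');
    INSERT INTO part VALUES (137, 'Door latch', 8259),
                            (150, 'Door handle', 8261),
                            (152, 'Compressor', 8259);
""")

# supplier_number is the common data element shared by both tables.
rows = conn.execute("""
    SELECT part.part_number, part.part_name, supplier.supplier_name   -- project
    FROM part
    JOIN supplier ON part.supplier_number = supplier.supplier_number  -- join
    WHERE part.part_number IN (137, 150)                              -- select
    ORDER BY part.part_number
""").fetchall()

for row in rows:
    print(row)
```

Without the shared supplier_number column there would be no way to tell which supplier goes with which part.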


Object-Oriented DBMS

A DBMS designed for organizing structured data into rows and columns is not well suited to handling graphics-based or multimedia applications. Object-oriented databases are better suited for this purpose.

An object-oriented DBMS stores the data and procedures that act on those data as objects that can be automatically retrieved and shared. Object-oriented database management systems (OODBMS) are becoming popular because they can be used to manage the various multimedia components or Java applets used in Web applications, which typically integrate pieces of information from a variety of sources. Although object-oriented databases can store more complex types of information than relational DBMS, they are relatively slow compared with relational DBMS.

 

Databases in the Cloud

Cloud computing providers offer database management services, but these services typically have less functionality than their on-premises counterparts.

 

CAPABILITIES OF DATABASE MANAGEMENT SYSTEMS

A DBMS includes capabilities and tools for organizing, managing, and accessing the data in the database. The most important are its data definition language, data dictionary, and data manipulation language.

 

A DBMS has a data definition capability to specify the structure of the content of the database. It is used to create database tables and to define the characteristics of the fields in each table. This information about the database is documented in a data dictionary. A data dictionary is an automated or manual file that stores definitions of data elements and their characteristics.
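A minimal sketch of data definition and a data dictionary, again using SQLite through Python's sqlite3 module. The supplier table and its fields are hypothetical; PRAGMA table_info is SQLite's own way of reading back a table definition, and other DBMS expose this information differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table and describe each field.
conn.execute("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL,
        supplier_zip    TEXT
    )
""")

# SQLite keeps the definition in its catalog. PRAGMA table_info returns
# one row per field: (cid, name, type, notnull, default_value, pk).
data_dictionary = [
    {"field": name, "type": col_type,
     "required": bool(notnull), "key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(supplier)")
]

for entry in data_dictionary:
    print(entry)
```

The printed entries are roughly what a data dictionary report contains: the name of each data element and its characteristics.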

 

Querying and Reporting

A DBMS includes tools for accessing and manipulating information in databases. Most DBMS have a specialized language called a data manipulation language that is used to add, change, delete, and retrieve the data in the database.
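The most widely used data manipulation language is Structured Query Language (SQL). The sketch below runs the four operations (add, change, delete, retrieve) against a hypothetical part table using Python's sqlite3 module; the part numbers and prices are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part ("
    "part_number INTEGER PRIMARY KEY, part_name TEXT, unit_price REAL)"
)

# Add
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(137, "Door latch", 22.00),
     (145, "Side mirror", 12.00),
     (150, "Door handle", 6.00)],
)

# Change
conn.execute("UPDATE part SET unit_price = 24.00 WHERE part_number = 137")

# Delete
conn.execute("DELETE FROM part WHERE part_number = 145")

# Retrieve
remaining = conn.execute(
    "SELECT part_number, part_name, unit_price FROM part ORDER BY part_number"
).fetchall()

print(remaining)
```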

 

DESIGNING DATABASES

To create a database, you must understand the relationships among the data, the type of data that will be maintained in the database, how the data will be used, and how the organization will need to change to manage data from a company-wide perspective. The database requires both a conceptual design and a physical design. The conceptual, or logical, design of a database is an abstract model of the database from a business perspective, whereas the physical design shows how the database is actually arranged on direct-access storage devices.

 

Normalization and Entity-Relationship Diagrams

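Normalization is the process of creating small, stable, yet flexible and adaptive data structures from complex groups of data. An entity-relationship diagram documents the relationships among the entities in the database.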
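A minimal sketch of the idea behind normalization, using plain Python structures instead of a DBMS. The order, part, and supplier values are hypothetical, and the split shown is one reasonable decomposition rather than a full treatment of normal forms.

```python
# One wide, repetitive structure: part and supplier details repeat
# on every order line. All values are hypothetical.
order_lines = [
    {"order": 1634, "part": 137, "part_name": "Door latch",
     "supplier": 8259, "supplier_name": "Supplier A"},
    {"order": 1634, "part": 152, "part_name": "Compressor",
     "supplier": 8259, "supplier_name": "Supplier A"},
    {"order": 1635, "part": 150, "part_name": "Door handle",
     "supplier": 8261, "supplier_name": "Supplier B"},
]

# Split into small tables in which each fact is stored only once.
suppliers = {r["supplier"]: r["supplier_name"] for r in order_lines}
parts = {
    r["part"]: {"part_name": r["part_name"], "supplier": r["supplier"]}
    for r in order_lines
}
line_items = [(r["order"], r["part"]) for r in order_lines]

print(suppliers)
print(parts)
print(line_items)
```

After the split, a supplier's name is stored in one place, so changing it requires a single update instead of one update per order line.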

 

DATA WAREHOUSES

A data warehouse is a database that stores current and historical data of potential interest to decision makers throughout the company. The data warehouse consolidates and standardizes information from different operational databases so that the information can be used across the enterprise for management analysis and decision making.

 

 

Data Marts

A data mart is a subset of a data warehouse in which a summarized or highly focused portion of the organization’s data is placed in a separate database for a specific population of users.

 

TOOLS FOR BUSINESS INTELLIGENCE: MULTIDIMENSIONAL DATA ANALYSIS AND DATA MINING

Business intelligence tools enable users to analyze data to see new patterns, relationships, and insights that are useful for guiding decision making.  Principal tools for business intelligence include software for database querying and reporting, tools for multidimensional data analysis (online analytical processing), and tools for data mining.

 

Online Analytical Processing (OLAP)

OLAP enables users to view the same data in different ways using multiple dimensions. Each aspect of information (product, pricing, cost, region, or time period) represents a different dimension.
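The idea of viewing the same data along different dimensions can be sketched without any OLAP product. The sales facts, the dimension names, and the roll_up helper below are all hypothetical.

```python
from collections import defaultdict

# Each fact is tagged with three dimensions (product, region, month)
# followed by the measure (units sold). Values are hypothetical.
sales = [
    ("nuts", "East", "Jan", 100), ("nuts", "West", "Jan", 80),
    ("bolts", "East", "Jan", 50), ("nuts", "East", "Feb", 120),
    ("bolts", "West", "Feb", 70),
]
DIMENSIONS = {"product": 0, "region": 1, "month": 2}

def roll_up(facts, *dims):
    """Total units sold, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[DIMENSIONS[d]] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

by_product = roll_up(sales, "product")
by_region_month = roll_up(sales, "region", "month")

print(by_product)
print(by_region_month)
```

The same five facts answer two different questions (sales by product, and sales by region and month) simply by changing which dimensions are used for grouping.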

 

Data Mining

Data mining provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior. The patterns and rules are used to guide decision making and forecast the effect of those decisions. The types of information obtainable from data mining include associations, sequences, classifications, clusters, and forecasts.
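Of the types listed, associations are the easiest to sketch: count how often items occur together and infer a simple rule. The baskets and the confidence helper are hypothetical, and real data mining tools use far more elaborate algorithms on far larger data sets.

```python
from collections import Counter
from itertools import combinations

# Each set is one purchase (hypothetical data).
baskets = [
    {"chips", "cola"}, {"chips", "cola", "salsa"},
    {"chips", "salsa"}, {"cola", "bread"}, {"chips", "cola"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))

def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

print(confidence("chips", "cola"))
```

A rule such as "customers who buy chips usually also buy cola" is the kind of hidden pattern that can then guide decisions like product placement or promotions.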

 

 

Text Mining and Web Mining

Text mining tools are now available to help businesses analyze unstructured data, most of which are in the form of text. These tools are able to extract key elements from large unstructured data sets, discover patterns and relationships, and summarize the information.

 

Web mining is the discovery and analysis of useful patterns and information from the World Wide Web.

 

 

MANAGING DATA RESOURCES

 

ESTABLISHING AN INFORMATION POLICY

An information policy specifies the organization’s rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information. Information policy lays out specific procedures and accountabilities, identifying which users and organizational units can share information, where information can be distributed, and who is responsible for updating and maintaining the information.

 

Data administration is responsible for the specific policies and procedures through which data can be managed as an organizational resource. These responsibilities include developing information policy, planning for data, overseeing logical database design and data dictionary development, and monitoring how information systems specialists and end-user groups use data. 

 

Data governance is a term used to describe many of these activities. Promoted by IBM, data governance deals with the policies and processes for managing the availability, usability, integrity, and security of the data employed in an enterprise, with special emphasis on promoting privacy, security, data quality, and compliance with government regulations.

A large organization will also have a database design and management group that establishes the physical database, the logical relations among elements, and the access rules and security procedures. The functions it performs are called database administration.

 

ENSURING DATA QUALITY

Ensuring data quality means making sure that the data in organizational databases are accurate and reliable. If a database is properly designed and enterprise-wide data standards are established, duplicate or inconsistent data elements should be minimal. Analysis of data quality often begins with a data quality audit, which is a structured survey of the accuracy and level of completeness of the data in an information system. Data quality audits can be performed by surveying entire data files, surveying samples from data files, or surveying end users for their perceptions of data quality.

 

Data cleansing, also known as data scrubbing, consists of activities for detecting and correcting data in a database that are incorrect, incomplete, improperly formatted, or redundant. Data cleansing not only corrects errors but also enforces consistency among different sets of data that originated in separate information systems.
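A small sketch of what a cleansing step might look like. The record fields, the formatting rules, and the choice to drop incomplete rows instead of repairing them are assumptions made for this example, not a standard procedure.

```python
def clean(records):
    """Normalize formatting, drop incomplete rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        # Fix improper formatting: collapse spaces, standardize capitals,
        # and keep only the digits of the phone number.
        name = " ".join(rec.get("name", "").split()).title()
        phone = "".join(ch for ch in rec.get("phone", "") if ch.isdigit())
        if not name or not phone:        # incomplete record
            continue
        key = (name, phone)
        if key in seen:                  # redundant record
            continue
        seen.add(key)
        cleaned.append({"name": name, "phone": phone})
    return cleaned

# Hypothetical rows that originated in two separate systems.
raw = [
    {"name": "  jane   doe ", "phone": "555-0101"},
    {"name": "Jane Doe", "phone": "(555) 0101"},
    {"name": "John Roe", "phone": ""},
]

print(clean(raw))
```

The first two rows describe the same person in different formats; once both are put into one consistent format, the duplicate becomes visible and can be removed.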

 

 

 

 
