What are the three data anomalies that are likely to occur as a result of data redundancy? Can data redundancy be completely eliminated in database approach? Why or why not?
Ans: The three sorts of anomalies that can take place in the database due to redundancy are insertion, deletion and modification or updation anomalies. Refer a relation emp_dept along with attributes: E#, Ename, Address, D#, Dname, Dmgr# with the primary key as E#.
Insertion anomaly: Let us suppose that a new department has been started by the organization but basically there is no employee appointed for that department, after that the tuple for this department cannot be inserted into this table as the E# will have NULL, which is not permitted as E# is primary key. This type of a problem in the relation in which some tuple cannot be inserted is termed as insertion anomaly.
Deletion anomaly: Now refer there is just only one employee in some department and that employee leaves the firm, after that the tuple of that employee has to be deleted from the table, but additionally to that the information about the department as well will get deleted. This type of a problem in the relation where deletion of some tuples can lead to loss of some other data not intended to be eliminated is termed as deletion anomaly.
Modification /update anomaly: Assume the manager of a department has changed, this needs that the Dmgr# in all the tuples corresponding to that department should be changed to reflect the new status. If we fail to update all the tuples of the given department, then two dissimilar records of employee working in similar department might depict different Dmgr# leading to contradiction in the database. This is termed as modification/update anomaly.
The data redundancy: it Cannot be totally removed from the database, although there should be controlled redundancy, for instance, consider a relation student_report(S#, Sname, Course#, SubjectName, marks) to store the marks of a student for a course comprising some optional subjects, but all the students should not chose similar optional papers. Now here the student name appears in every tuple, that is redundant and we can have two tables as students(S#, Sname, CourseName) and Report(S#, SubjectName, Marks). Though, if we want to print the mark-sheet for every student by using these tables then a join operation that is a costly operation, in terms of resources needed to perform, has to be performed to get the name of the student. Thus to save on the resource utilization, we might opt to store a single relation, students_report just only.