Analyzing ELF files

Reference no: EM133028182

Assignment - Analyzing ELF files

In this assignment you will use memory-mapped file I/O to open and read ELF files. (Skeleton projects are provided so that you can choose whether you want to write the program in C or C++.)

The code for the program is in a source file called magic.c, and it builds an executable called magic.

Requirements

These are the main requirements for your program:

2. It should determine whether the opened file is an ELF file, and if not, print Not an ELF file to standard output and exit normally.

3. If it is an ELF file, it should summarize the ELF header, sections, and symbols (as described below), and then exit normally.

Note that ELF files for 32-bit systems are slightly different from ELF files for 64-bit systems. Also, ELF files can use little-endian or big-endian byte ordering depending on the machine type. Since handling these variations can be tricky, you can earn up to 99% of full credit by only supporting 64-bit little-endian ELF files, such as the ones produced on x86-64 Linux systems.

Memory-mapped file I/O
As we've discussed in lecture, the OS kernel uses pages of physical memory as a cache for data on mass storage devices (hard disks and SSDs). The mmap system call allows programs to map pages containing disk or SSD data into their own address space.

Let's say that you want to map the contents of an ELF file into memory using mmap. First, you'll need to use the open system call to open the file:

int fd = open(filename, O_RDONLY);

This call will return a file descriptor for the opened file, or a negative value if the file can't be opened.

Next, the program will need to know how many bytes of data the file has. This can be accomplished by calling fstat:

struct stat statbuf;
int rc = fstat(fd, &statbuf);
if (rc != 0) {
  // error
} else {
  size_t file_size = statbuf.st_size;
  // ...
}

Once the program knows the size of the file, creating a private read-only mapping using mmap will allow the program to access the file contents in memory:

void *data = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);

If mmap returns a pointer value other than ((void *)-1), which is the value of the MAP_FAILED macro, the pointer value points to a region of memory in which the program can access the file contents.
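The three calls above can be combined into one helper. The following is a minimal sketch, not part of the skeleton code: the name map_file and its convention of returning NULL on failure are assumptions made for illustration. It also treats an empty file as a failure, since mmap rejects a length of 0.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps a whole file read-only.
 * On success, returns the mapping and stores its size in *size_out.
 * Returns NULL if the file can't be opened, sized, or mapped. */
void *map_file(const char *filename, size_t *size_out) {
  int fd = open(filename, O_RDONLY);
  if (fd < 0) {
    return NULL; /* the file can't be opened */
  }

  struct stat statbuf;
  if (fstat(fd, &statbuf) != 0 || statbuf.st_size <= 0) {
    close(fd);
    return NULL; /* size unknown, or empty file (mmap rejects length 0) */
  }

  size_t file_size = (size_t) statbuf.st_size;
  void *data = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd); /* the mapping remains valid after the descriptor is closed */
  if (data == MAP_FAILED) {
    return NULL;
  }

  *size_out = file_size;
  return data;
}
```

A caller would report an error on standard error and exit with a non-zero code when this helper returns NULL.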

Decoding ELF files

Here is a high-level summary:

The ELF header at the beginning of the ELF file data contains general information about the ELF file, and indicates the locations of the program headers and section headers

The section headers describe the layout of the sections within the ELF file

In the ELF header, the e_shstrndx field indicates which section is the .shstrtab section. This section is a string table that contains the section names. This is very important information, since the section names stored in the .shstrtab section will allow you to determine the identities of the other sections. It is especially important for your program to locate the .symtab and .strtab sections, since you will need to access data in these sections to find information about symbols.

On Linux systems, the <elf.h> header defines data types and constants that will be useful for parsing ELF files.

This header file is located in /usr/include/elf.h . You should open this file using a text editor so that you can see the definitions it contains.
As an example, the Elf64_Ehdr data type defines the layout of the ELF header for 64-bit ELF files. Here is an example of how your program could use this header. Let's say that data is a pointer to the beginning of the ELF file data in memory. The ELF header is always at the beginning of an ELF file. So, casting data to be a pointer to Elf64_Ehdr would allow your program to inspect the fields of the ELF header:

Elf64_Ehdr *elf_header = (Elf64_Ehdr *) data;
printf(".shstrtab section index is %u\n", elf_header->e_shstrndx);

[One important detail to note here is that directly accessing fields of structs in a memory-mapped file will only yield the correct value if the byte ordering used in the file is the same as the byte ordering used by the system on which the program is running. As mentioned earlier, you may assume this as a simplifying assumption in your code.]

Other important data types in <elf.h> include Elf64_Shdr , which describes the layout of a section header, and Elf64_Sym , which describes the layout of a symbol.

Note that there are 32-bit variants of each data type (e.g., Elf32_Ehdr), but you are not required to handle 32-bit ELF files.
In general, much of your program logic will be concerned with computing the address of a structure in the ELF data, and casting that address value to be a pointer to one of the data types in <elf.h> .

We highly recommend using the unsigned char * type as the data type for doing address computations. Pointer arithmetic using this type will be in units of bytes, and if you access an unsigned char value using such a pointer, you are guaranteed to see an unsigned value.
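As a small illustration of byte-wise address computation, here is a sketch with two hypothetical helpers (is_elf and section_header_at are names invented for this example). The assignment text does not spell out how to recognize an ELF file; the check below assumes the usual approach of comparing the first bytes against the magic number that <elf.h> provides as ELFMAG, whose length is SELFMAG. The second helper assumes a well-formed 64-bit ELF image in the host's byte order and does no bounds checking.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the data starts with the ELF magic number
 * (0x7f 'E' 'L' 'F'), and 0 otherwise. */
int is_elf(const unsigned char *data, size_t size) {
  return size >= SELFMAG && memcmp(data, ELFMAG, SELFMAG) == 0;
}

/* Computes the address of section header number idx.
 * Because data is an unsigned char pointer, the arithmetic is in bytes:
 * start of file + e_shoff + idx * (size of one section header). */
const Elf64_Shdr *section_header_at(const unsigned char *data, size_t idx) {
  const Elf64_Ehdr *ehdr = (const Elf64_Ehdr *) data;
  const unsigned char *addr = data + ehdr->e_shoff + idx * ehdr->e_shentsize;
  return (const Elf64_Shdr *) addr;
}
```

Using e_shentsize rather than sizeof(Elf64_Shdr) is a design choice here; for ordinary 64-bit ELF files the two values are the same.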

Suggested approach

Here is a suggested approach to finding the section and symbol table information:

1. Find the section headers using the e_shoff value in the ELF header. The number of section headers is indicated by the e_shnum value in the ELF header.

2. Use the e_shstrndx value in the ELF header to indicate which section contains the string table with the names of the sections. (This is the .shstrtab section.)

3. Scan through the section headers, which will be objects of type Elf64_Shdr. In each section header, the sh_offset value indicates the location of that section's data, the sh_size value indicates the size of the section's data, and the sh_name field is the offset of the section's name string in the .shstrtab section data. Make a note of the name of each section, and other required information. Based on your scan of the section headers, you should be able to print the required output for each section. See Required output below.

4. The .symtab section is a sequence of Elf64_Sym objects. Each one has an st_name field. The value of st_name, if it is not 0, is the offset of the symbol's name string in the .strtab section data. By scanning the symbols in the .symtab section data, you should be able to print the required information for each symbol.
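The four steps above can be sketched as follows. The helper names (find_section, symbol_count, symbol_name) are hypothetical, and the code assumes a well-formed 64-bit ELF image in the host's byte order, with no bounds checking on any offset.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns the header of the section called name, or NULL if none matches. */
const Elf64_Shdr *find_section(const unsigned char *data, const char *name) {
  const Elf64_Ehdr *ehdr = (const Elf64_Ehdr *) data;
  /* Step 1: the section header table starts e_shoff bytes into the file. */
  const Elf64_Shdr *shdrs = (const Elf64_Shdr *) (data + ehdr->e_shoff);
  /* Step 2: e_shstrndx selects the section holding the section names. */
  const char *shstrtab =
      (const char *) (data + shdrs[ehdr->e_shstrndx].sh_offset);
  /* Step 3: scan all e_shnum headers, comparing each section name. */
  for (size_t i = 0; i < ehdr->e_shnum; i++) {
    if (strcmp(shstrtab + shdrs[i].sh_name, name) == 0) {
      return &shdrs[i];
    }
  }
  return NULL;
}

/* Step 4: number of Elf64_Sym objects stored in a symbol table section. */
size_t symbol_count(const Elf64_Shdr *symtab) {
  return symtab->sh_size / sizeof(Elf64_Sym);
}

/* Step 4: name of one symbol, read from the .strtab section data.
 * An st_name of 0 points at the NUL byte that begins the string table,
 * so it yields an empty string. */
const char *symbol_name(const unsigned char *data, const Elf64_Shdr *strtab,
                        const Elf64_Sym *sym) {
  return (const char *) (data + strtab->sh_offset + sym->st_name);
}
```

With these helpers, the symbols are the symbol_count(symtab) objects starting at data + symtab->sh_offset, where symtab is the result of find_section(data, ".symtab").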

Required output

Your program will be invoked as

./magic filename

where filename is the file to analyze. As a special case, if the file can't be opened or can't be mapped into memory, your program should print an error message to standard error and exit with a non-zero exit code.

If the file being analyzed is not an ELF file, your program should simply print

Not an ELF file

to standard output and exit with an exit code of 0.

Otherwise, the program should summarize the ELF header, summarize the section headers, and summarize the symbols, and then exit with an exit code of 0.

To summarize the information in the ELF header, your program should print three lines of the form

Object file type: objtype
Instruction set: machtype
Endianness: endianness

For objtype and machtype, translate the values of the e_type and e_machine fields of the ELF header to strings using the get_type_name and get_machine_name functions defined in elf_names.h and elf_names.c / elf_names.cpp.

For endianness, print either Little endian or Big endian. (Endianness is found in the EI_DATA element of the e_ident array in the ELF header.)
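A minimal sketch of the endianness lookup, using the ELFDATA2LSB and ELFDATA2MSB constants from <elf.h>. The helper name endianness_name is hypothetical, and the "Unknown" fallback is an assumption, since the assignment text only names the two valid cases.

```c
#include <elf.h>

/* Maps the EI_DATA byte of the e_ident array to the string to print. */
const char *endianness_name(const unsigned char *e_ident) {
  switch (e_ident[EI_DATA]) {
  case ELFDATA2LSB:
    return "Little endian";
  case ELFDATA2MSB:
    return "Big endian";
  default:
    return "Unknown"; /* assumption: this case is not specified */
  }
}
```

Given a pointer elf_header to the Elf64_Ehdr, the call would be endianness_name(elf_header->e_ident).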

601.229 (F21): Assignment 4: Analyzing ELF files

After the ELF header summary, the program should print one line of output for each section, in the following format:

Section header N: name=name, type=X, offset=Y, size=Z

N is a section index in the range 0 to e_shnum - 1. name is the section name, which will be a NUL-terminated string value in the .shstrtab section data. X, Y, and Z are the values of the section header's sh_type, sh_offset, and sh_size fields, respectively. Each of these values should be printed using printf with the %lx conversion. Note that the name may be an empty string.

After the summary of section headers, the program should print one line of output for each symbol, in the following format:

Symbol N: name=name, size=X, info=Y, other=Z

N is the index of the symbol (0 for the first symbol), and name is the name of the symbol based on the symbol's st_name value (if non-zero, it specifies an offset in the .strtab section data). X, Y, and Z are the values of the symbol's st_size, st_info, and st_other fields, respectively, printed using printf with the %lx conversion.
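The two per-line formats can be produced as sketched below. The function names are hypothetical, and printing the index N in decimal is an assumption, because the text only specifies %lx for X, Y, and Z. The sketch formats into a buffer with snprintf so the result is easy to check; the same format strings work with printf. The casts to unsigned long matter because sh_type is a 32-bit field and st_info and st_other are single bytes, while %lx expects an unsigned long argument.

```c
#include <elf.h>
#include <stdio.h>

/* Formats the summary line for one section header (no trailing newline). */
int format_section_line(char *buf, size_t len, unsigned long idx,
                        const char *name, const Elf64_Shdr *sh) {
  return snprintf(buf, len,
                  "Section header %lu: name=%s, type=%lx, offset=%lx, size=%lx",
                  idx, name, (unsigned long) sh->sh_type,
                  (unsigned long) sh->sh_offset, (unsigned long) sh->sh_size);
}

/* Formats the summary line for one symbol (no trailing newline). */
int format_symbol_line(char *buf, size_t len, unsigned long idx,
                       const char *name, const Elf64_Sym *sym) {
  return snprintf(buf, len,
                  "Symbol %lu: name=%s, size=%lx, info=%lx, other=%lx",
                  idx, name, (unsigned long) sym->st_size,
                  (unsigned long) sym->st_info, (unsigned long) sym->st_other);
}
```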
