Draw a DFA for a lexical analyzer

Assignment Help Computer Engineering
Reference no: EM132706803

Assignment 1 - Compiler Design

1 Regular Expressions

1. Following the POSIX ERE syntax, give a regular expression that matches the date and temperature pairs reasonably similar to the following positive examples. Your expression must not match any of the negative examples.

Positive examples         Negative examples

01-12-2000 0K              1-9-2019 -4K

01-09-2019 40.2?C        12-09-2017 22.?C

28-03-2019 50?F           28-13-2019 50 ?F

01-12-2530 -2.33?C      01-12-2017 -5

Notice the following:
• There should be no whitespace before the temperature unit (which can be 'K', '°C' or '°F').
• A Kelvin temperature cannot be negative.
• There should be some whitespace(s) between the date and temperature.
• The day field must have a numerical value between 01 and 31 and the month field between 01 and 12. Years always have four digits.
• A decimal number should have digits before and after the decimal point. Decimal numbers can have at most two digits to the right of the decimal point.

2. Give short answers: Your regular expression may also match some strings that do not correspond to actual dates on a calendar. What additional checks are needed to filter out those?

Lexical Analysis

Function definitions in Erlang, which is a concurrent functional programming language, look like this:

decrease_to_zero(Argument1) ->
case Argument1 > 0 of
true ->
LocalVariable = Argument1 - 1, decrease_to_zero(LocalVariable);
false ->
<<"zero?", Argument1>>
end.

Most elements here are similar to those in most programming languages. More specifically this code snippet contains integers (0, and 1), strings ("zero?"), arithmetic operators (-), comparison operators (>), other operators (like = which in Erlang denotes pattern matching), keywords of the language (case, of, and end), and other miscellaneous elements of the language ((, ), ,, ->, <<, and >>)2. In addition, this code contains Erlang variables (Argument1 and LocalVariable) and atoms (decrease_to_zero, true, and false).
In Erlang, variables begin with a capital letter and atoms begin with a lowercase letter. Both categories can subsequently contain letters, digits, and underscores.

Erlang's own lexical analyzer returns the following list of tokens for this code. For tokens that only have one element in the corresponding category, we only show the token itself; tokens that are of general categories are shown as pairs where the first element shows the token category and the second element shows the lexeme.

[{id, decrease_to_zero}, '(', {var, Argument1}, ')', '->',
case, {var, Argument1}, '>', {integer, 0}, of,
{boolean, true}, '->',
{var, LocalVariable}, '=', {var, Argument1}, '-', {integer, 1}, ',',
{id, decrease_to_zero}, '(', {var, LocalVariable}, ')', ';',
{boolean, false}, '->',
'<<', {string, "zero?"}, ',', {var, Argument1}, '>>',
end, '.']

Tasks:
1. Write regular expressions that can match Erlang's:
(a) integers
(b) atoms
(c) variables
(d) strings; it should be possible for a string to contain quote characters; use '\' as an escape character4
2. Draw a DFA (or NFA)5 for a lexical analyzer that can recognize all the token categories present in the excerpt. Mark each accepting state with the label of the corresponding token. Your analyzer must not recognize more syntactic elements of the language (e.g., other comparison operators) than those necessary for scanning the code above but should be able to recognize different programs with the same elements.
3. Describe in detail (i.e. including all transitions from the initial state until a token or an error is emitted) how the following strings will be split into tokens by the lexical analyzer you defined in part 2.
(a) >>>=>>>->
(b) >>>->>-
4. What tokens (or errors) will be returned by your lexical analyzer for the following inputs?
(a) case factorial(Input) of 120 -> "Input was 5"; Other -> Other end
(b) OneOfTheseIsATrap = <<"be careful">> < <<65>>.
(c) OneOfTheseIsATrap = <<"be careful">> <<<65>>.
5. Give short answers:
(a) Lexical analyzers often treat consecutive whitespace characters as a single token, or ignore them altogether (emitting no tokens). Give examples where such behaviours are undesirable (one where a lexical analyzer should not ignore whitespace characters and one where it should additionally return separate tokens for each).
(b) Lexical analyzers normally ignore (do not emit tokens for) "comments". Give an example where such behaviour is undesirable.

Assignment 2

1 Parsing and Semantic Actions

Dice notation is a system to represent di?erent combi- nations of dice in role-playing games using simple algebra-like notation. The syntax of dice notation follows the following grammar, where all symbols in typewritter font (+, x, d, and int) are terminal symbols:
Rolls → Roll | Sum
Sum → Sum1 + Rolls | Roll
Roll → Dice | int | Dice x int
Dice → int1 d int2 | d int
A sample string of the language is 42 + 4d7x3 + d20
1. Is the grammar ambiguous? If it is, give an example. Otherwise, justify why it is not.
2. Give a leftmost derivation for the sample string.
3. Draw a parse tree for the sample string.

4. Write semantic actions to calculate the result of a dice roll expressed in dice notation. You may assume that in your actions you can also call the function random(n) which returns an integer between 1 and n (inclusive). You can execute arbitrary code in semantic actions (e.g., a for loop might come in handy), as long as you also assign some value to the left-hand side symbol. The final result should be stored as the value of the start symbol (i.e., Rolls.val).
5. Label/number your semantic actions. Describe an evaluation of the sample input, using the tree from question 3 and mentioning the order in which the actions are executed (use arbitrary random values for the results of calls to random).

2 LL(1) Grammars

In the following context-free grammar, which describes expressions in a simplified functional language, the symbols +, (, ), ->, and <id> are terminals and S is the initial symbol.
S → F + S | <id> -> S | F F → A V
A → F | ?
V → <id> | ( S )

1. Explain briefly why this grammar cannot be directly parsed by an LL(1) parser.

2. Convert this grammar to an equivalent one that has a better chance to be LL(1).

3. Write the First and Follow sets for all non-terminal symbols of the new grammar.

4. Construct the complete LL(1) parsing table for the new grammar.

5. Consider the tokens x and f to belong to the <id> token class. Show all the steps required to parse the input string: f(x -> x + f f x).

Assignment 3

1 LR(1) Grammars
Consider the following grammar, where {a,b,c,d,e} are terminal symbols and S is the initial symbol.
S → A a (1)
S → e (2)
A → d B (3)

A → b B S (4)

B → c B (5)
B → ? (6)

1. Write the First sets for all non-terminal symbols of the grammar.

2. Construct the full LR(1) DFA, showing all items in each state.

3. Construct the LR(1) parsing table using the DFA. For the reduce actions, use the provided enumeration of the productions in the grammar.

4. Show all steps required to parse the following string: bbcceaa

5. Is this grammar LALR(1)?

2 Activation Records

In the following listing we show an excerpt of a program in a language that allows nested definitions for functions and uses call-by-value:int collatz_length(int n, int len) {

boolean collatz_base() { return n == 34;
}

if (collatz_base()) return len + 13;
else
if ((n mod 2) == 1)
return collatz_length(3 * n + 1, len + 1); else
return collatz_length(n div 2, len + 1);
}

Suppose we have in our main() function a call to collatz length(7, 0). Assume that the compiler allocates all the variables (i.e. both named and temporary) on the stack.
Questions
1. What will the return value of this call be?
2. What is the maximum number of activation records that will ever be in the stack before the call from
main returns? (Assume that the compiler does not optimize tail calls.)

3. Give a diagram of the state of the stack just before collatz base returns true. Your diagram should contain at least the following information:
(a) Boundaries of the activation records.
(b) Location and content of the actual parameters for the calls to the functions.
(c) Where in the stack is the value of n, for which collatz base returns true.
(d) Location of the return values.
(e) Location of the return addresses.
(f) Location and content of the control links and access links.
(g) Optional: Location and content of temporary variables.
4. How should the value of n be retrieved, in order to make the comparison in collatz base?

5. Optional: Describe briefly the tail call (or tail recursion) optimization and whether it can be applied to this code snippet. You can read about it in the Wikipedia.

Attachment:- CSE_assignments.rar

Reference no: EM132706803

Questions Cloud

Make a budget showing quantity of solvent to purchased : Make a budget showing the quantity of solvent Q80 to be purchased for July, August, and September, and for the quarter in total.
Explain the quantitative analysis of options : Using your selected BA plan template, detail the solution you have selected and the criteria and evaluation instruments used. This document is an interim step.
Which of the statements regarding variable cost is true : Which of the statements regarding variable cost is true? Variable cost increases on a per unit basis as the number of units produced increases.
Exposing the various aspects of political process : What specific roles do both media forums that you chose have in exposing the various aspects of a political process?
Draw a DFA for a lexical analyzer : Lexical analyzer that can recognize all the token categories present in the excerpt. Mark each accepting state with the label of the corresponding token
Determine how much should operating income increase : Assume that the operating results for last year, The president expects sales to increase by 12% next year. By how much should operating income increase?
Assess the risk of material misstatement at assertion level : The bonds paid 7% interest every December 31, For each of the accounting issues, assess the risk of material misstatement at the assertion level
How does a high employee turnover rate impact : Employee turnover can be very costly to an organization in terms of lost productivity as well as added recruitment costs. While some turnover is to be expected.
Pros and cons on both sides of the debate about laws : Summarize recent developments in several states enacting voter ID laws. Explain the pros and cons on both sides of the debate about these laws.

Reviews

Write a Review

Computer Engineering Questions & Answers

  Calculate the corresponding phase shifts

Use trigonometric identity #5 to derive an expression for cos 8? in terms of cos 9?. cos 7?. and cos ?.

  Network in a gsm system

Consider two mobiles belonging to the same home network in a GSM system. Explain step by step the process in GSM roaming, when one mobile wakes up in a foreign network and makes a call to the other.

  Define the human elements in it strategy

The success factors for your organization's information technology (IT) were identified in the previous assignment. Now, you can turn your attention.

  What are the two most important benefits of java language

What are the two most important benefits of the Java language? How long does it take to learn the entire Java library?

  Application showing sizes of two files and their ratio

Build a file which consists of your favourite movie quote. Make a use of a text editor like Notepad and save file as Quote.txt. Copy contents of the file and paste them into the word-processing program like Word. Save file as Quote.doc. Write down..

  Create a summary of stats for the dataset

Create a summary of stats for the dataset. (provide a screen shot). Create a correlation of stats for the dataset. (provide a screen shot).

  Describe one application relevant for ig based energy system

IGs need dc-ac, ac-dc, and ac-ac converters. Describe one application, relevant for IG-based energy system, for each of those power conversions.

  Role of the nonce in preventing reuse key streams

Describe the role of the nonce in preventing reuse key streams when using the same passphrase to encrypt different files.

  What could have been done differently to protect user data

Find a recent article that relates to data leakage related to user information. How was the information accessed? What were the ramifications?

  How to prove that a connected graph g without cycles

How to prove that a connected graph G without cycles has n-1 edges, where n > 0 is the number of nodes of G, using induction?

  Determine the z-transforms and sketch the roc

Design a digital Chebyshev-II filter and determine the coefficients of the impulse response h(n) of the FIR filter and Determine the Z-transforms and sketch the ROC

  Calculate the total sale for the week and display the result

Write a python program that asks the user to enter a store's sales for each day of the week.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd