Reference no: EM131103662
The attached file contains a number of sentences. Each word in these sentences is annotated with a tag. I would like a script that splits the sentences into chunks based on the annotation tags (tag1 and tag2). The chunks must be written to two files according to their tags: a tag1 file and a tag2 file. The sentence number, chunk number, and word index must be preserved.
To clarify, I will use the example in the attached file:
There are 4 sentences:
sentence1: He is a good person .
sentence2: Thank you so much !!
sentence3: John Adam likes to play with his friends :)
sentence4: Netflix has almost 75 million global subscribers.
These sentences must be split into chunks and written to two files as follows:
For sentence 1, there are two chunks:
Chunk-1: He is >> written in tag1 file
Chunk-2: a good person . >> written in tag2 file
For sentence 2, there are two chunks:
Chunk-1: Thank you >> written in tag2 file
Chunk-2: so much !! >> written in tag1 file
For sentence 3, there are four chunks:
Chunk-1: John Adam likes >> written in tag1 file
Chunk-2: to play >> written in tag2 file
Chunk-3: with his >> written in tag1 file
Chunk-4: friends :) >> written in tag2 file
For sentence 4, there are four chunks:
Chunk-1: Netflix has >> written in tag1 file
Chunk-2: almost 75 million >> written in tag2 file
Chunk-3: global >> written in tag1 file
Chunk-4: subscribers >> written in tag2 file
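The chunk examples above suggest a rule that the post does not state outright, so the following Python sketch treats it as an assumption: a word whose tag is neither tag1 nor tag2 (NE, number, punctuation, emoticon) joins the chunk of the nearest preceding tag1/tag2 word, and words at the start of a sentence with no preceding tag1/tag2 word join the chunk that follows. The names `resolve_tags` and `split_into_chunks` are hypothetical.

```python
from itertools import groupby

CHUNK_TAGS = {"tag1", "tag2"}

def resolve_tags(tags):
    """Map every tag to tag1 or tag2 (assumed rule, inferred from the examples).

    Other tags inherit the nearest preceding chunk tag; leading tokens with
    no preceding chunk tag take the first chunk tag found in the sentence.
    A sentence without any tag1/tag2 word resolves to None throughout.
    """
    resolved, last = [], None
    for tag in tags:
        if tag in CHUNK_TAGS:
            last = tag
        resolved.append(last)
    first = next((t for t in resolved if t is not None), None)
    return [t if t is not None else first for t in resolved]

def split_into_chunks(sentence):
    """sentence: list of (word_index, word, tag) tuples.

    Returns a list of (chunk_tag, [(word_index, word), ...]) in sentence order.
    """
    resolved = resolve_tags([tag for _, _, tag in sentence])
    chunks = []
    for chunk_tag, group in groupby(zip(resolved, sentence), key=lambda p: p[0]):
        chunks.append((chunk_tag, [(idx, word) for _, (idx, word, _) in group]))
    return chunks

# Sentence 3 from the sample table.
sentence3 = [
    (0, "John", "NE"), (1, "Adam", "NE"), (2, "likes", "tag1"),
    (3, "to", "tag2"), (4, "play", "tag2"), (5, "with", "tag1"),
    (6, "his", "tag1"), (7, "friends", "tag2"), (8, ":)", "emoticon"),
]
for chunk_no, (chunk_tag, words) in enumerate(split_into_chunks(sentence3), start=1):
    print(f"Chunk-{chunk_no}: {' '.join(w for _, w in words)} >> {chunk_tag} file")
```

Under this assumed rule the sketch reproduces the four chunks listed for sentence 3; a different rule for NE or punctuation words would only require changing `resolve_tags`.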
As mentioned above, the following information must be maintained: sentence number, chunk number, and word index. This information is needed to reconstruct the sentences, so the script should be able to use the two files (the tag1 and tag2 files) to rebuild the original file (the attached file).
I'm attaching only a sample of the sentences. I will test the script on the original file, which contains a very large number of sentences.
You can write two scripts, one for splitting into two files and the other for joining the two files to rebuild the original file, or a single script that does both tasks.
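One possible way to make the split reversible is sketched below in Python. The file layout is an assumption, not something the post specifies: each output line is tab-separated and holds sentence number, chunk number, word index, word, and the word's original tag, so that tags such as NE or punctuation survive the round trip. The function names `write_split` and `join_split` are hypothetical.

```python
import csv
import os
import tempfile

def write_split(rows, tag1_path, tag2_path):
    """Write rows to the tag1 or tag2 file according to each row's chunk tag.

    rows: iterable of (sentence_no, chunk_no, word_index, word, tag, chunk_tag),
    where chunk_tag is "tag1" or "tag2" and tag is the original annotation.
    """
    with open(tag1_path, "w", newline="", encoding="utf-8") as f1, \
         open(tag2_path, "w", newline="", encoding="utf-8") as f2:
        writers = {"tag1": csv.writer(f1, delimiter="\t"),
                   "tag2": csv.writer(f2, delimiter="\t")}
        for sentence_no, chunk_no, word_index, word, tag, chunk_tag in rows:
            writers[chunk_tag].writerow([sentence_no, chunk_no, word_index, word, tag])

def join_split(tag1_path, tag2_path):
    """Read both files and restore the original order of the words.

    Returns a list of (sentence_no, word_index, word, tag), sorted by
    sentence number and then word index.
    """
    records = []
    for path in (tag1_path, tag2_path):
        with open(path, newline="", encoding="utf-8") as f:
            for sentence_no, _chunk_no, word_index, word, tag in csv.reader(f, delimiter="\t"):
                records.append((int(sentence_no), int(word_index), word, tag))
    records.sort(key=lambda r: (r[0], r[1]))
    return records

# Sentence 1 from the sample, already assigned to chunks.
rows = [
    (1, 1, 0, "He", "tag1", "tag1"),
    (1, 1, 1, "is", "tag1", "tag1"),
    (1, 2, 2, "a", "tag2", "tag2"),
    (1, 2, 3, "good", "tag2", "tag2"),
    (1, 2, 4, "person", "tag2", "tag2"),
    (1, 2, 5, ".", "punctuation", "tag2"),
]
with tempfile.TemporaryDirectory() as tmp:
    tag1_file = os.path.join(tmp, "tag1.tsv")
    tag2_file = os.path.join(tmp, "tag2.tsv")
    write_split(rows, tag1_file, tag2_file)
    restored = join_split(tag1_file, tag2_file)
print(restored)
```

Sorting on (sentence number, word index) is enough to rebuild the word order; the chunk number is kept in the files because the post asks for it. For a very large input, sorting in memory may not be practical, and a streaming merge of the two files would be an alternative.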
Word-Index | Word | Tag
0 | He | tag1
1 | is | tag1
2 | a | tag2
3 | good | tag2
4 | person | tag2
5 | . | punctuation
0 | Thank | tag2
1 | you | tag2
2 | so | tag1
3 | much | tag1
4 | !! | punctuation
0 | John | NE
1 | Adam | NE
2 | likes | tag1
3 | to | tag2
4 | play | tag2
5 | with | tag1
6 | his | tag1
7 | friends | tag2
8 | :) | emoticon
0 | Netflix | NE
1 | has | tag1
2 | almost | tag2
3 | 75 | number
4 | million | number
5 | global | tag1
6 | subscribers | tag2
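The sample has no explicit sentence-number column; the word index simply restarts at 0 for each sentence. A minimal Python sketch of recovering sentence boundaries from that restart follows. It assumes each row is available as three pipe-separated fields (word index, word, tag), which may not match the real attached file, and `parse_rows` is a hypothetical name.

```python
def parse_rows(lines):
    """Group 'index | word | tag' rows into sentences.

    A word index of 0 is taken to start a new sentence (an assumption based
    on the sample). Rows whose first field is not a number, such as the
    header, are skipped. A word that is itself "|" would need a different
    delimiter.
    """
    sentences = []
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        index, word, tag = int(fields[0]), fields[1], fields[2]
        if index == 0 or not sentences:
            sentences.append([])
        sentences[-1].append((index, word, tag))
    return sentences

sample = [
    "Word-Index | Word | Tag",
    "0 | He | tag1",
    "1 | is | tag1",
    "0 | Thank | tag2",
    "1 | you | tag2",
]
for sentence_no, sentence in enumerate(parse_rows(sample), start=1):
    print(sentence_no, sentence)
```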