Finding maximum homework grade ECOO 2018Vitaly & Pie Programming challengeCounting number of...

Could solar power be utilized and substitute coal in the 19th century?

What is the term when two people sing in harmony, but they aren't singing the same notes?

How will losing mobility of one hand affect my career as a programmer?

Simple image editor tool to draw a simple box/rectangle in an existing image

Can somebody explain Brexit in a few child-proof sentences?

I'm in charge of equipment buying but no one's ever happy with what I choose. How to fix this?

Would it be legal for a US State to ban exports of a natural resource?

Freedom of speech and where it applies

Partial sums of primes

Can a controlled ghast be a leader of a pack of ghouls?

Superhero words!

How did Monica know how to operate Carol's "designer"?

Why isn't KTEX's runway designation 10/28 instead of 9/27?

node command while defining a coordinate in TikZ

For airliners, what prevents wing strikes on landing in bad weather?

Hostile work environment after whistle-blowing on coworker and our boss. What do I do?

Is infinity mathematically observable?

word describing multiple paths to the same abstract outcome

What if somebody invests in my application?

Proof of Lemma: Every integer can be written as a product of primes

Can I use my Chinese passport to enter China after I acquired another citizenship?

Is there any significance to the Valyrian Stone vault door of Qarth?

How can I raise concerns with a new DM about XP splitting?

Why does this part of the Space Shuttle launch pad seem to be floating in air?

Finding maximum homework grade ECOO 2018

Vitaly & Pie Programming challengeCounting number of inversionsProcessing data for five weather stationsFinding the maximum distance from the starting nodeFinding the maximum element of a StackFinding the maximum GCD of all pairsRegression on Pandas DataFrameTaxi fare calculator, trip recorder, and fuel calculatorContest Solution: Al Gore RhythmHackerRank: Fraudulent activity notification

I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:

Problem 2: Homework

George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.

Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).

The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.

For the first seven cases, N ≤ 1000.

Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).

Sample Input (Two Datasets Shown) Sample Output
3                                   3.0000

1 1.0                               17.0000

2 1.0

3 1.0

5

1 2.0  

1 1.0  

3 3.0  

7 10.0  

3 2.0  

(pixel raster of original Problem Statement (including it’s))

Solution #1:
In this solution I adopted the elimination approach.

I store the deadline along with the assignments due in this day in a dictionary.

Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).

I estimated the complexity to be O(dlogd + dwlogw).

def main():

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)





    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()



    # Algorithm starts here

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()



        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]



    tot = sum(selected_assignments)

    new = format(tot, ".4f")

    print(new)



main()

Solution 2: In this solution I work directly,

I create a 2-dimensional list and sort in reverse order so that the weights are first.

Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.

I estimate the complexity to be around O(nlogn + nd).

def main():

    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    n = int(next(my_lines))



    def take_second(elem):

        return elem[1]



    biggest = 0

    deadlines_weights_list = []

    for i in range(n):

        d, w = [float(x) for x in next(my_lines).split()]

        d = int(d)

        if d > biggest:

            biggest = d

        deadlines_weights_list.append([d, w])



    deadlines_weights_list.sort(key=take_second, reverse=True)



    possible_days = [day+1 for day in range(biggest)]



    total = 0

    for deadline, weight in deadlines_weights_list:

        # If there are no days left the code should be terminated 

        if len(possible_days) == 0:

            break



        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break



        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight





    total = format(total, ".4f")

    print(total)



main()

I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.

edited 12 hours ago

asked yesterday

Michael exe

112

New contributor

$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday

$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday

$begingroup$
Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday

$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago

$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago

add a comment |

Problem 2: Homework

George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.

Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).

The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.

For the first seven cases, N ≤ 1000.

Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).

Sample Input (Two Datasets Shown) Sample Output
3                                   3.0000

1 1.0                               17.0000

2 1.0

3 1.0

5

1 2.0  

1 1.0  

3 3.0  

7 10.0  

3 2.0  

(pixel raster of original Problem Statement (including it’s))

def main():

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)





    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()



    # Algorithm starts here

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()



        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]



    tot = sum(selected_assignments)

    new = format(tot, ".4f")

    print(new)



main()

def main():

    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    n = int(next(my_lines))



    def take_second(elem):

        return elem[1]



    biggest = 0

    deadlines_weights_list = []

    for i in range(n):

        d, w = [float(x) for x in next(my_lines).split()]

        d = int(d)

        if d > biggest:

            biggest = d

        deadlines_weights_list.append([d, w])



    deadlines_weights_list.sort(key=take_second, reverse=True)



    possible_days = [day+1 for day in range(biggest)]



    total = 0

    for deadline, weight in deadlines_weights_list:

        # If there are no days left the code should be terminated 

        if len(possible_days) == 0:

            break



        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break



        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight





    total = format(total, ".4f")

    print(total)



main()

edited 12 hours ago

asked yesterday

Michael exe

112

New contributor

$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday

$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday

$begingroup$
Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday

$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago

$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago

add a comment |

Problem 2: Homework

George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.

Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).

The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.

For the first seven cases, N ≤ 1000.

Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).

Sample Input (Two Datasets Shown) Sample Output
3                                   3.0000

1 1.0                               17.0000

2 1.0

3 1.0

5

1 2.0  

1 1.0  

3 3.0  

7 10.0  

3 2.0  

(pixel raster of original Problem Statement (including it’s))

def main():

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)





    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()



    # Algorithm starts here

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()



        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]



    tot = sum(selected_assignments)

    new = format(tot, ".4f")

    print(new)



main()

def main():

    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    n = int(next(my_lines))



    def take_second(elem):

        return elem[1]



    biggest = 0

    deadlines_weights_list = []

    for i in range(n):

        d, w = [float(x) for x in next(my_lines).split()]

        d = int(d)

        if d > biggest:

            biggest = d

        deadlines_weights_list.append([d, w])



    deadlines_weights_list.sort(key=take_second, reverse=True)



    possible_days = [day+1 for day in range(biggest)]



    total = 0

    for deadline, weight in deadlines_weights_list:

        # If there are no days left the code should be terminated 

        if len(possible_days) == 0:

            break



        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break



        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight





    total = format(total, ".4f")

    print(total)



main()

edited 12 hours ago

asked yesterday

Michael exe

112

New contributor

Problem 2: Homework

George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.

Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).

The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.

For the first seven cases, N ≤ 1000.

Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).

Sample Input (Two Datasets Shown) Sample Output
3                                   3.0000

1 1.0                               17.0000

2 1.0

3 1.0

5

1 2.0  

1 1.0  

3 3.0  

7 10.0  

3 2.0  

(pixel raster of original Problem Statement (including it’s))

def main():

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)





    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()



    # Algorithm starts here

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()



        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]



    tot = sum(selected_assignments)

    new = format(tot, ".4f")

    print(new)



main()

def main():

    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    n = int(next(my_lines))



    def take_second(elem):

        return elem[1]



    biggest = 0

    deadlines_weights_list = []

    for i in range(n):

        d, w = [float(x) for x in next(my_lines).split()]

        d = int(d)

        if d > biggest:

            biggest = d

        deadlines_weights_list.append([d, w])



    deadlines_weights_list.sort(key=take_second, reverse=True)



    possible_days = [day+1 for day in range(biggest)]



    total = 0

    for deadline, weight in deadlines_weights_list:

        # If there are no days left the code should be terminated 

        if len(possible_days) == 0:

            break



        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break



        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight





    total = format(total, ".4f")

    print(total)



main()

python sorting time-limit-exceeded dynamic-programming

edited 12 hours ago

asked yesterday

Michael exe

112

New contributor

edited 12 hours ago

asked yesterday

Michael exe

112

New contributor

edited 12 hours ago

asked yesterday

Michael exe

112

New contributor

asked yesterday

Michael exe

112

asked yesterday

Michael exe

112

New contributor

Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday

$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday

$begingroup$
Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday

$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago

$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago

add a comment |

$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday

$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday

$begingroup$
Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday

$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago

$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago

Please include the problem description as text not an image.

– Peilonrayz
yesterday

Okay I edited it

– Michael exe
yesterday

Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).

– greybeard
yesterday

Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.

– Graipher
16 hours ago

yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)

– Michael exe
12 hours ago

add a comment |

1 Answer
1

active

oldest

votes

Approach 1

def main():

    ... monolithic method ...



main()

Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)

This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.

I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?

        d, w = next(my_lines).split()

        homework_dict[int(d)].append(float(w))

    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()

I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.

    # Algorithm starts here

See my previous point about factoring out methods.

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()

This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:

Sort homework_dict[deadline] and then merge two sorted lists in linear time.

Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.

        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]

difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.

It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.

Approach 2

A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.

    possible_days = [day+1 for day in range(biggest)]

What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.

        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break

However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.

        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight

However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.

Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):

available = find(deadline).data

if available >= 0:

    total += weight

    union(available, available - 1)

ensuring that union chooses the smaller of the two associated data items.

edited 6 hours ago

answered 9 hours ago

Peter Taylor

18.1k2963

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

Michael exe is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216115%2ffinding-maximum-homework-grade-ecoo-2018%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

Approach 1

def main():

    ... monolithic method ...



main()

Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)

This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.

I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?

        d, w = next(my_lines).split()

        homework_dict[int(d)].append(float(w))

    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()

I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.

    # Algorithm starts here

See my previous point about factoring out methods.

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()

This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:

Sort homework_dict[deadline] and then merge two sorted lists in linear time.

Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.

        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]

Approach 2

A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.

    possible_days = [day+1 for day in range(biggest)]

        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break

        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight

available = find(deadline).data

if available >= 0:

    total += weight

    union(available, available - 1)

ensuring that union chooses the smaller of the two associated data items.

edited 6 hours ago

answered 9 hours ago

Peter Taylor

18.1k2963

add a comment |

Approach 1

def main():

    ... monolithic method ...



main()

Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)

This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.

I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?

        d, w = next(my_lines).split()

        homework_dict[int(d)].append(float(w))

    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()

I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.

    # Algorithm starts here

See my previous point about factoring out methods.

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()

This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:

Sort homework_dict[deadline] and then merge two sorted lists in linear time.

Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.

        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]

Approach 2

A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.

    possible_days = [day+1 for day in range(biggest)]

        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break

        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight

available = find(deadline).data

if available >= 0:

    total += weight

    union(available, available - 1)

ensuring that union chooses the smaller of the two associated data items.

edited 6 hours ago

answered 9 hours ago

Peter Taylor

18.1k2963

add a comment |

Approach 1

def main():

    ... monolithic method ...



main()

Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)

This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.

I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?

        d, w = next(my_lines).split()

        homework_dict[int(d)].append(float(w))

    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()

I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.

    # Algorithm starts here

See my previous point about factoring out methods.

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()

This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:

Sort homework_dict[deadline] and then merge two sorted lists in linear time.

Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.

        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]

Approach 2

A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.

    possible_days = [day+1 for day in range(biggest)]

        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break

        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight

available = find(deadline).data

if available >= 0:

    total += weight

    union(available, available - 1)

ensuring that union chooses the smaller of the two associated data items.

edited 6 hours ago

answered 9 hours ago

Peter Taylor

18.1k2963

Approach 1

def main():

    ... monolithic method ...



main()

Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".

    from collections import defaultdict



    with open("DATA21.txt") as all_data:

        my_lines = iter(all_data.readlines())



    number_of_assignments = int(next(my_lines))



    homework_dict = defaultdict(list)



    for _ in range(number_of_assignments):

        d, w = [float(i) for i in next(my_lines).split()]

        d = int(d)

        # Setting up the dictionary



        homework_dict[d].append(w)

This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.

I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?

        d, w = next(my_lines).split()

        homework_dict[int(d)].append(float(w))

    all_deadlines = list(homework_dict.keys())



    all_deadlines.sort()

I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.

    # Algorithm starts here

See my previous point about factoring out methods.

    selected_assignments = []

    for deadline in all_deadlines:

        deadline_assignments = homework_dict[deadline]

        deadline_assignments.extend(selected_assignments)

        deadline_assignments.sort()

This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:

Sort homework_dict[deadline] and then merge two sorted lists in linear time.

Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.

        difference = len(deadline_assignments) - deadline

        if difference < 0:

            selected_assignments = deadline_assignments

        else:

            selected_assignments = deadline_assignments[difference::]

Approach 2

A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.

    possible_days = [day+1 for day in range(biggest)]

        while deadline not in possible_days:

            deadline -= 1

            # This means the only days left are really high, with much more time. 

            if deadline < possible_days[0]:

                break

        if deadline in possible_days:

            # One day cannot be used twice 

            possible_days.remove(deadline)

            total += weight

available = find(deadline).data

if available >= 0:

    total += weight

    union(available, available - 1)

ensuring that union chooses the smaller of the two associated data items.

edited 6 hours ago

answered 9 hours ago

Peter Taylor

18.1k2963

edited 6 hours ago

answered 9 hours ago

Peter Taylor

18.1k2963

answered 9 hours ago

Peter Taylor

18.1k2963

answered 9 hours ago

Peter Taylor

18.1k2963

add a comment |

Michael exe is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Michael exe is a new contributor. Be nice, and check out our Code of Conduct.

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Ggthjy