Finding maximum homework grade ECOO 2018Vitaly & Pie Programming challengeCounting number of...
Could solar power be utilized and substitute coal in the 19th century?
What is the term when two people sing in harmony, but they aren't singing the same notes?
How will losing mobility of one hand affect my career as a programmer?
Simple image editor tool to draw a simple box/rectangle in an existing image
Can somebody explain Brexit in a few child-proof sentences?
I'm in charge of equipment buying but no one's ever happy with what I choose. How to fix this?
Would it be legal for a US State to ban exports of a natural resource?
Freedom of speech and where it applies
Partial sums of primes
Can a controlled ghast be a leader of a pack of ghouls?
Superhero words!
How did Monica know how to operate Carol's "designer"?
Why isn't KTEX's runway designation 10/28 instead of 9/27?
node command while defining a coordinate in TikZ
For airliners, what prevents wing strikes on landing in bad weather?
Hostile work environment after whistle-blowing on coworker and our boss. What do I do?
Is infinity mathematically observable?
word describing multiple paths to the same abstract outcome
What if somebody invests in my application?
Proof of Lemma: Every integer can be written as a product of primes
Can I use my Chinese passport to enter China after I acquired another citizenship?
Is there any significance to the Valyrian Stone vault door of Qarth?
How can I raise concerns with a new DM about XP splitting?
Why does this part of the Space Shuttle launch pad seem to be floating in air?
Finding maximum homework grade ECOO 2018
Vitaly & Pie Programming challengeCounting number of inversionsProcessing data for five weather stationsFinding the maximum distance from the starting nodeFinding the maximum element of a StackFinding the maximum GCD of all pairsRegression on Pandas DataFrameTaxi fare calculator, trip recorder, and fuel calculatorContest Solution: Al Gore RhythmHackerRank: Fraudulent activity notification
$begingroup$
I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:
Problem 2: Homework
George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.
Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).
Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.
Input Specifications
DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).
The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.
For the first seven cases, N ≤ 1000.
Output Specifications
For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).
Sample Input (Two Datasets Shown) Sample Output
3 3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0
(pixel raster of original Problem Statement (including it’s))
Solution #1:
In this solution I adopted the elimination approach.
I store the deadline along with the assignments due in this day in a dictionary.
Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).
I estimated the complexity to be O(dlogd + dwlogw).
def main():
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)
main()
Solution 2: In this solution I work directly,
I create a 2-dimensional list and sort in reverse order so that the weights are first.
Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.
I estimate the complexity to be around O(nlogn + nd).
def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
n = int(next(my_lines))
def take_second(elem):
return elem[1]
biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])
deadlines_weights_list.sort(key=take_second, reverse=True)
possible_days = [day+1 for day in range(biggest)]
total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
total = format(total, ".4f")
print(total)
main()
I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.
python sorting time-limit-exceeded dynamic-programming
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
$endgroup$
add a comment |
$begingroup$
I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:
Problem 2: Homework
George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.
Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).
Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.
Input Specifications
DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).
The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.
For the first seven cases, N ≤ 1000.
Output Specifications
For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).
Sample Input (Two Datasets Shown) Sample Output
3 3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0
(pixel raster of original Problem Statement (including it’s))
Solution #1:
In this solution I adopted the elimination approach.
I store the deadline along with the assignments due in this day in a dictionary.
Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).
I estimated the complexity to be O(dlogd + dwlogw).
def main():
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)
main()
Solution 2: In this solution I work directly,
I create a 2-dimensional list and sort in reverse order so that the weights are first.
Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.
I estimate the complexity to be around O(nlogn + nd).
def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
n = int(next(my_lines))
def take_second(elem):
return elem[1]
biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])
deadlines_weights_list.sort(key=take_second, reverse=True)
possible_days = [day+1 for day in range(biggest)]
total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
total = format(total, ".4f")
print(total)
main()
I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.
python sorting time-limit-exceeded dynamic-programming
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
$endgroup$
$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday
$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday
$begingroup$
Welcome to Code Review!two codes in one reviewif you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday
$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago
$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago
add a comment |
$begingroup$
I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:
Problem 2: Homework
George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.
Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).
Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.
Input Specifications
DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).
The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.
For the first seven cases, N ≤ 1000.
Output Specifications
For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).
Sample Input (Two Datasets Shown) Sample Output
3 3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0
(pixel raster of original Problem Statement (including it’s))
Solution #1:
In this solution I adopted the elimination approach.
I store the deadline along with the assignments due in this day in a dictionary.
Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).
I estimated the complexity to be O(dlogd + dwlogw).
def main():
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)
main()
Solution 2: In this solution I work directly,
I create a 2-dimensional list and sort in reverse order so that the weights are first.
Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.
I estimate the complexity to be around O(nlogn + nd).
def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
n = int(next(my_lines))
def take_second(elem):
return elem[1]
biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])
deadlines_weights_list.sort(key=take_second, reverse=True)
possible_days = [day+1 for day in range(biggest)]
total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
total = format(total, ".4f")
print(total)
main()
I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.
python sorting time-limit-exceeded dynamic-programming
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
$endgroup$
I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:
Problem 2: Homework
George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.
Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).
Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.
Input Specifications
DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).
The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.
For the first seven cases, N ≤ 1000.
Output Specifications
For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).
Sample Input (Two Datasets Shown) Sample Output
3 3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0
(pixel raster of original Problem Statement (including it’s))
Solution #1:
In this solution I adopted the elimination approach.
I store the deadline along with the assignments due in this day in a dictionary.
Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).
I estimated the complexity to be O(dlogd + dwlogw).
def main():
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)
main()
Solution 2: In this solution I work directly,
I create a 2-dimensional list and sort in reverse order so that the weights are first.
Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.
I estimate the complexity to be around O(nlogn + nd).
def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
n = int(next(my_lines))
def take_second(elem):
return elem[1]
biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])
deadlines_weights_list.sort(key=take_second, reverse=True)
possible_days = [day+1 for day in range(biggest)]
total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
total = format(total, ".4f")
print(total)
main()
I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.
python sorting time-limit-exceeded dynamic-programming
python sorting time-limit-exceeded dynamic-programming
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
edited 12 hours ago
Michael exe
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
asked yesterday
Michael exeMichael exe
112
112
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday
$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday
$begingroup$
Welcome to Code Review!two codes in one reviewif you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday
$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago
$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago
add a comment |
$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday
$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday
$begingroup$
Welcome to Code Review!two codes in one reviewif you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday
$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago
$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago
$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday
$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday
$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday
$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday
$begingroup$
Welcome to Code Review!
two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).$endgroup$
– greybeard
yesterday
$begingroup$
Welcome to Code Review!
two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).$endgroup$
– greybeard
yesterday
$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago
$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago
$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago
$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago
add a comment |
1 Answer
1
active
oldest
votes
$begingroup$
Approach 1
def main():
... monolithic method ...
main()
Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.
I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?
d, w = next(my_lines).split()
homework_dict[int(d)].append(float(w))
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.
# Algorithm starts here
See my previous point about factoring out methods.
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:
- Sort
homework_dict[deadline]and then merge two sorted lists in linear time. - Use a priority queue of some kind for
selected_assignments. Addhomework_dict[deadline]to the queue and then pop it down to the desired length. - The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.
It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.
Approach 2
A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.
possible_days = [day+1 for day in range(biggest)]
What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.
Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):
available = find(deadline).data
if available >= 0:
total += weight
union(available, available - 1)
ensuring that union chooses the smaller of the two associated data items.
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Michael exe is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216115%2ffinding-maximum-homework-grade-ecoo-2018%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Approach 1
def main():
... monolithic method ...
main()
Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.
I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?
d, w = next(my_lines).split()
homework_dict[int(d)].append(float(w))
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.
# Algorithm starts here
See my previous point about factoring out methods.
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:
- Sort
homework_dict[deadline]and then merge two sorted lists in linear time. - Use a priority queue of some kind for
selected_assignments. Addhomework_dict[deadline]to the queue and then pop it down to the desired length. - The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.
It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.
Approach 2
A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.
possible_days = [day+1 for day in range(biggest)]
What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.
Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):
available = find(deadline).data
if available >= 0:
total += weight
union(available, available - 1)
ensuring that union chooses the smaller of the two associated data items.
$endgroup$
add a comment |
$begingroup$
Approach 1
def main():
... monolithic method ...
main()
Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.
I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?
d, w = next(my_lines).split()
homework_dict[int(d)].append(float(w))
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.
# Algorithm starts here
See my previous point about factoring out methods.
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:
- Sort
homework_dict[deadline]and then merge two sorted lists in linear time. - Use a priority queue of some kind for
selected_assignments. Addhomework_dict[deadline]to the queue and then pop it down to the desired length. - The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.
It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.
Approach 2
A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.
possible_days = [day+1 for day in range(biggest)]
What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.
Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):
available = find(deadline).data
if available >= 0:
total += weight
union(available, available - 1)
ensuring that union chooses the smaller of the two associated data items.
$endgroup$
add a comment |
$begingroup$
Approach 1
def main():
... monolithic method ...
main()
Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.
I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?
d, w = next(my_lines).split()
homework_dict[int(d)].append(float(w))
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.
# Algorithm starts here
See my previous point about factoring out methods.
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:
- Sort
homework_dict[deadline]and then merge two sorted lists in linear time. - Use a priority queue of some kind for
selected_assignments. Addhomework_dict[deadline]to the queue and then pop it down to the desired length. - The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.
It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.
Approach 2
A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.
possible_days = [day+1 for day in range(biggest)]
What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.
Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):
available = find(deadline).data
if available >= 0:
total += weight
union(available, available - 1)
ensuring that union chooses the smaller of the two associated data items.
$endgroup$
Approach 1
def main():
... monolithic method ...
main()
Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".
from collections import defaultdict
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())
number_of_assignments = int(next(my_lines))
homework_dict = defaultdict(list)
for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary
homework_dict[d].append(w)
This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.
I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?
d, w = next(my_lines).split()
homework_dict[int(d)].append(float(w))
all_deadlines = list(homework_dict.keys())
all_deadlines.sort()
I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.
# Algorithm starts here
See my previous point about factoring out methods.
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()
This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:
- Sort
homework_dict[deadline]and then merge two sorted lists in linear time. - Use a priority queue of some kind for
selected_assignments. Addhomework_dict[deadline]to the queue and then pop it down to the desired length. - The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.
difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]
difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.
It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.
Approach 2
A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.
possible_days = [day+1 for day in range(biggest)]
What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.
while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break
However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.
if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight
However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.
Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):
available = find(deadline).data
if available >= 0:
total += weight
union(available, available - 1)
ensuring that union chooses the smaller of the two associated data items.
edited 6 hours ago
answered 9 hours ago
Peter TaylorPeter Taylor
18.1k2963
18.1k2963
add a comment |
add a comment |
Michael exe is a new contributor. Be nice, and check out our Code of Conduct.
Michael exe is a new contributor. Be nice, and check out our Code of Conduct.
Michael exe is a new contributor. Be nice, and check out our Code of Conduct.
Michael exe is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216115%2ffinding-maximum-homework-grade-ecoo-2018%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday
$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday
$begingroup$
Welcome to Code Review!
two codes in one reviewif you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).$endgroup$
– greybeard
yesterday
$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago
$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago