Finding maximum homework grade ECOO 2018Vitaly & Pie Programming challengeCounting number of...

Could solar power be utilized and substitute coal in the 19th century?

What is the term when two people sing in harmony, but they aren't singing the same notes?

How will losing mobility of one hand affect my career as a programmer?

Simple image editor tool to draw a simple box/rectangle in an existing image

Can somebody explain Brexit in a few child-proof sentences?

I'm in charge of equipment buying but no one's ever happy with what I choose. How to fix this?

Would it be legal for a US State to ban exports of a natural resource?

Freedom of speech and where it applies

Partial sums of primes

Can a controlled ghast be a leader of a pack of ghouls?

Superhero words!

How did Monica know how to operate Carol's "designer"?

Why isn't KTEX's runway designation 10/28 instead of 9/27?

node command while defining a coordinate in TikZ

For airliners, what prevents wing strikes on landing in bad weather?

Hostile work environment after whistle-blowing on coworker and our boss. What do I do?

Is infinity mathematically observable?

word describing multiple paths to the same abstract outcome

What if somebody invests in my application?

Proof of Lemma: Every integer can be written as a product of primes

Can I use my Chinese passport to enter China after I acquired another citizenship?

Is there any significance to the Valyrian Stone vault door of Qarth?

How can I raise concerns with a new DM about XP splitting?

Why does this part of the Space Shuttle launch pad seem to be floating in air?



Finding maximum homework grade ECOO 2018


Vitaly & Pie Programming challengeCounting number of inversionsProcessing data for five weather stationsFinding the maximum distance from the starting nodeFinding the maximum element of a StackFinding the maximum GCD of all pairsRegression on Pandas DataFrameTaxi fare calculator, trip recorder, and fuel calculatorContest Solution: Al Gore RhythmHackerRank: Fraudulent activity notification













2












$begingroup$


I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:




Problem 2: Homework





George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.



Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).



The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.



For the first seven cases, N ≤ 1000.



Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).



Sample Input (Two Datasets Shown) Sample Output



3                                   3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0



(pixel raster of original Problem Statement (including it’s))



Solution #1:
In this solution I adopted the elimination approach.

I store the deadline along with the assignments due in this day in a dictionary.

Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).

I estimated the complexity to be O(dlogd + dwlogw).



def main():
from collections import defaultdict

with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

number_of_assignments = int(next(my_lines))

homework_dict = defaultdict(list)

for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary

homework_dict[d].append(w)


all_deadlines = list(homework_dict.keys())

all_deadlines.sort()

# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()

difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]

tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)

main()


Solution 2: In this solution I work directly,

I create a 2-dimensional list and sort in reverse order so that the weights are first.

Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.

I estimate the complexity to be around O(nlogn + nd).



def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

n = int(next(my_lines))

def take_second(elem):
return elem[1]

biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])

deadlines_weights_list.sort(key=take_second, reverse=True)

possible_days = [day+1 for day in range(biggest)]

total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break

while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break

if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight


total = format(total, ".4f")
print(total)

main()


I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.










share|improve this question









New contributor




Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







$endgroup$












  • $begingroup$
    Please include the problem description as text not an image.
    $endgroup$
    – Peilonrayz
    yesterday










  • $begingroup$
    Okay I edited it
    $endgroup$
    – Michael exe
    yesterday










  • $begingroup$
    Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
    $endgroup$
    – greybeard
    yesterday










  • $begingroup$
    Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
    $endgroup$
    – Graipher
    16 hours ago












  • $begingroup$
    yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
    $endgroup$
    – Michael exe
    12 hours ago
















2












$begingroup$


I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:




Problem 2: Homework





George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.



Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).



The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.



For the first seven cases, N ≤ 1000.



Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).



Sample Input (Two Datasets Shown) Sample Output



3                                   3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0



(pixel raster of original Problem Statement (including it’s))



Solution #1:
In this solution I adopted the elimination approach.

I store the deadline along with the assignments due in this day in a dictionary.

Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).

I estimated the complexity to be O(dlogd + dwlogw).



def main():
from collections import defaultdict

with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

number_of_assignments = int(next(my_lines))

homework_dict = defaultdict(list)

for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary

homework_dict[d].append(w)


all_deadlines = list(homework_dict.keys())

all_deadlines.sort()

# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()

difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]

tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)

main()


Solution 2: In this solution I work directly,

I create a 2-dimensional list and sort in reverse order so that the weights are first.

Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.

I estimate the complexity to be around O(nlogn + nd).



def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

n = int(next(my_lines))

def take_second(elem):
return elem[1]

biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])

deadlines_weights_list.sort(key=take_second, reverse=True)

possible_days = [day+1 for day in range(biggest)]

total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break

while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break

if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight


total = format(total, ".4f")
print(total)

main()


I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.










share|improve this question









New contributor




Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







$endgroup$












  • $begingroup$
    Please include the problem description as text not an image.
    $endgroup$
    – Peilonrayz
    yesterday










  • $begingroup$
    Okay I edited it
    $endgroup$
    – Michael exe
    yesterday










  • $begingroup$
    Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
    $endgroup$
    – greybeard
    yesterday










  • $begingroup$
    Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
    $endgroup$
    – Graipher
    16 hours ago












  • $begingroup$
    yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
    $endgroup$
    – Michael exe
    12 hours ago














2












2








2





$begingroup$


I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:




Problem 2: Homework





George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.



Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).



The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.



For the first seven cases, N ≤ 1000.



Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).



Sample Input (Two Datasets Shown) Sample Output



3                                   3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0



(pixel raster of original Problem Statement (including it’s))



Solution #1:
In this solution I adopted the elimination approach.

I store the deadline along with the assignments due in this day in a dictionary.

Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).

I estimated the complexity to be O(dlogd + dwlogw).



def main():
from collections import defaultdict

with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

number_of_assignments = int(next(my_lines))

homework_dict = defaultdict(list)

for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary

homework_dict[d].append(w)


all_deadlines = list(homework_dict.keys())

all_deadlines.sort()

# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()

difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]

tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)

main()


Solution 2: In this solution I work directly,

I create a 2-dimensional list and sort in reverse order so that the weights are first.

Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.

I estimate the complexity to be around O(nlogn + nd).



def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

n = int(next(my_lines))

def take_second(elem):
return elem[1]

biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])

deadlines_weights_list.sort(key=take_second, reverse=True)

possible_days = [day+1 for day in range(biggest)]

total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break

while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break

if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight


total = format(total, ".4f")
print(total)

main()


I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.










share|improve this question









New contributor




Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







$endgroup$




I'm trying to solve ECOO 2018 round 2(regionals) question 2 to prepare for the upcoming contest. Basically, in the question George has to achieve the maximum possible grade by completing the right assignments. Each assignment has a deadline and a weight value. I've came up with two solutions yet both only work for 7 of the cases and are extremely slow for the other 3. Here is the problem statement:




Problem 2: Homework





George has procrastinated too much on his N homework assignments,
and now he is running out of time to finish them all.

Each of George’s N assignments has a weight that it contributes
to his grade and a deadline in days from today.
George will need one day to finish any of the assignments and he must
complete an assignment before it’s deadline in order to submit it
(he can’t complete it the day an assignment is due).

Help George figure out
the order in which he should complete his assignments
such that the total weight of the assignments he completes is maximized.



Input Specifications

DATA21.txt (DATA22.txt for the second try) will contain 10 datasets.
Each dataset begins with an integer N (1 ≤ N ≤ 1,000,000).



The next N lines contain an integer D and decimal W
(1 ≤ D ≤ 1,000,000; 0 < W ≤ 100), representing
an assignment that has a deadline that is D days from today
and a weight of W.



For the first seven cases, N ≤ 1000.



Output Specifications

For each dataset, output the maximum total weight of
the assignments that George can complete, rounded to 4 decimal places
(George is very meticulous about his grade).



Sample Input (Two Datasets Shown) Sample Output



3                                   3.0000
1 1.0 17.0000
2 1.0
3 1.0
5
1 2.0
1 1.0
3 3.0
7 10.0
3 2.0



(pixel raster of original Problem Statement (including it’s))



Solution #1:
In this solution I adopted the elimination approach.

I store the deadline along with the assignments due in this day in a dictionary.

Then I sort all the assignments (keys) and iterate sequentially, each time picking the highest d assignments where d is the deadline (because you cannot complete more than 1 assignment in a day, 3 in 3 days and so on).

I estimated the complexity to be O(dlogd + dwlogw).



def main():
from collections import defaultdict

with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

number_of_assignments = int(next(my_lines))

homework_dict = defaultdict(list)

for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary

homework_dict[d].append(w)


all_deadlines = list(homework_dict.keys())

all_deadlines.sort()

# Algorithm starts here
selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()

difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]

tot = sum(selected_assignments)
new = format(tot, ".4f")
print(new)

main()


Solution 2: In this solution I work directly,

I create a 2-dimensional list and sort in reverse order so that the weights are first.

Then I iterate through this list and look if it's possible to complete the current assignment by the deadline. I do this by creating a list of all the days and removing each deadline I already completed.

I estimate the complexity to be around O(nlogn + nd).



def main():
with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

n = int(next(my_lines))

def take_second(elem):
return elem[1]

biggest = 0
deadlines_weights_list = []
for i in range(n):
d, w = [float(x) for x in next(my_lines).split()]
d = int(d)
if d > biggest:
biggest = d
deadlines_weights_list.append([d, w])

deadlines_weights_list.sort(key=take_second, reverse=True)

possible_days = [day+1 for day in range(biggest)]

total = 0
for deadline, weight in deadlines_weights_list:
# If there are no days left the code should be terminated
if len(possible_days) == 0:
break

while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break

if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight


total = format(total, ".4f")
print(total)

main()


I apologize for posting two codes in one review, I'm more concerned about finding an algorithm that is fast enough for all of test cases rather than comparing this codes. Both solutions work for 7 test cases out of 10 but exceed the time limit for the other 3. Would appreciate any suggestions to optimize this solutions or a completely new way to solve this challenge.







python sorting time-limit-exceeded dynamic-programming






share|improve this question









New contributor




Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











share|improve this question









New contributor




Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this question




share|improve this question








edited 12 hours ago







Michael exe













New contributor




Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









asked yesterday









Michael exeMichael exe

112




112




New contributor




Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






Michael exe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.












  • $begingroup$
    Please include the problem description as text not an image.
    $endgroup$
    – Peilonrayz
    yesterday










  • $begingroup$
    Okay I edited it
    $endgroup$
    – Michael exe
    yesterday










  • $begingroup$
    Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
    $endgroup$
    – greybeard
    yesterday










  • $begingroup$
    Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
    $endgroup$
    – Graipher
    16 hours ago












  • $begingroup$
    yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
    $endgroup$
    – Michael exe
    12 hours ago


















  • $begingroup$
    Please include the problem description as text not an image.
    $endgroup$
    – Peilonrayz
    yesterday










  • $begingroup$
    Okay I edited it
    $endgroup$
    – Michael exe
    yesterday










  • $begingroup$
    Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
    $endgroup$
    – greybeard
    yesterday










  • $begingroup$
    Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
    $endgroup$
    – Graipher
    16 hours ago












  • $begingroup$
    yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
    $endgroup$
    – Michael exe
    12 hours ago
















$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday




$begingroup$
Please include the problem description as text not an image.
$endgroup$
– Peilonrayz
yesterday












$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday




$begingroup$
Okay I edited it
$endgroup$
– Michael exe
yesterday












$begingroup$
Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday




$begingroup$
Welcome to Code Review! two codes in one review if you want a Comparative Review, please tag comparative-review (doesn't hurt to explicitly ask for one in the question, too).
$endgroup$
– greybeard
yesterday












$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago






$begingroup$
Do both solutions correctly solve the problem? If not, this would be off-topic, see our help center. If they merely exceed the time limits that is OK, we even have a tag for that: time-limit-exceeded.
$endgroup$
– Graipher
16 hours ago














$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago




$begingroup$
yes they both solve the problem correctly with 7 out of 10 test cases it's just that for the other 3 test cases they are not efficient enough (takes a couple minutes to execute in compare to 1 second for all the first 7)
$endgroup$
– Michael exe
12 hours ago










1 Answer
1






active

oldest

votes


















2












$begingroup$

Approach 1




def main():
... monolithic method ...

main()



Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".






    from collections import defaultdict

with open("DATA21.txt") as all_data:
my_lines = iter(all_data.readlines())

number_of_assignments = int(next(my_lines))

homework_dict = defaultdict(list)

for _ in range(number_of_assignments):
d, w = [float(i) for i in next(my_lines).split()]
d = int(d)
# Setting up the dictionary

homework_dict[d].append(w)



This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.



I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?



        d, w = next(my_lines).split()
homework_dict[int(d)].append(float(w))





    all_deadlines = list(homework_dict.keys())

all_deadlines.sort()



I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.






    # Algorithm starts here



See my previous point about factoring out methods.






    selected_assignments = []
for deadline in all_deadlines:
deadline_assignments = homework_dict[deadline]
deadline_assignments.extend(selected_assignments)
deadline_assignments.sort()



This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:




  1. Sort homework_dict[deadline] and then merge two sorted lists in linear time.

  2. Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

  3. The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.



        difference = len(deadline_assignments) - deadline
if difference < 0:
selected_assignments = deadline_assignments
else:
selected_assignments = deadline_assignments[difference::]



difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.





It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.





Approach 2



A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.






    possible_days = [day+1 for day in range(biggest)]



What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.






        while deadline not in possible_days:
deadline -= 1
# This means the only days left are really high, with much more time.
if deadline < possible_days[0]:
break



However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.




        if deadline in possible_days:
# One day cannot be used twice
possible_days.remove(deadline)
total += weight



However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.



Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):



available = find(deadline).data
if available >= 0:
total += weight
union(available, available - 1)


ensuring that union chooses the smaller of the two associated data items.






share|improve this answer











$endgroup$













    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "196"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });






    Michael exe is a new contributor. Be nice, and check out our Code of Conduct.










    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216115%2ffinding-maximum-homework-grade-ecoo-2018%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    2












    $begingroup$

    Approach 1




    def main():
    ... monolithic method ...

    main()



    Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".






        from collections import defaultdict

    with open("DATA21.txt") as all_data:
    my_lines = iter(all_data.readlines())

    number_of_assignments = int(next(my_lines))

    homework_dict = defaultdict(list)

    for _ in range(number_of_assignments):
    d, w = [float(i) for i in next(my_lines).split()]
    d = int(d)
    # Setting up the dictionary

    homework_dict[d].append(w)



    This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.



    I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?



            d, w = next(my_lines).split()
    homework_dict[int(d)].append(float(w))





        all_deadlines = list(homework_dict.keys())

    all_deadlines.sort()



    I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.






        # Algorithm starts here



    See my previous point about factoring out methods.






        selected_assignments = []
    for deadline in all_deadlines:
    deadline_assignments = homework_dict[deadline]
    deadline_assignments.extend(selected_assignments)
    deadline_assignments.sort()



    This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:




    1. Sort homework_dict[deadline] and then merge two sorted lists in linear time.

    2. Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

    3. The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.



            difference = len(deadline_assignments) - deadline
    if difference < 0:
    selected_assignments = deadline_assignments
    else:
    selected_assignments = deadline_assignments[difference::]



    difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.





    It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.





    Approach 2



    A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.






        possible_days = [day+1 for day in range(biggest)]



    What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.






            while deadline not in possible_days:
    deadline -= 1
    # This means the only days left are really high, with much more time.
    if deadline < possible_days[0]:
    break



    However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.




            if deadline in possible_days:
    # One day cannot be used twice
    possible_days.remove(deadline)
    total += weight



    However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.



    Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):



    available = find(deadline).data
    if available >= 0:
    total += weight
    union(available, available - 1)


    ensuring that union chooses the smaller of the two associated data items.






    share|improve this answer











    $endgroup$


















      2












      $begingroup$

      Approach 1




      def main():
      ... monolithic method ...

      main()



      Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".






          from collections import defaultdict

      with open("DATA21.txt") as all_data:
      my_lines = iter(all_data.readlines())

      number_of_assignments = int(next(my_lines))

      homework_dict = defaultdict(list)

      for _ in range(number_of_assignments):
      d, w = [float(i) for i in next(my_lines).split()]
      d = int(d)
      # Setting up the dictionary

      homework_dict[d].append(w)



      This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.



      I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?



              d, w = next(my_lines).split()
      homework_dict[int(d)].append(float(w))





          all_deadlines = list(homework_dict.keys())

      all_deadlines.sort()



      I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.






          # Algorithm starts here



      See my previous point about factoring out methods.






          selected_assignments = []
      for deadline in all_deadlines:
      deadline_assignments = homework_dict[deadline]
      deadline_assignments.extend(selected_assignments)
      deadline_assignments.sort()



      This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:




      1. Sort homework_dict[deadline] and then merge two sorted lists in linear time.

      2. Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

      3. The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.



              difference = len(deadline_assignments) - deadline
      if difference < 0:
      selected_assignments = deadline_assignments
      else:
      selected_assignments = deadline_assignments[difference::]



      difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.





      It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.





      Approach 2



      A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.






          possible_days = [day+1 for day in range(biggest)]



      What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.






              while deadline not in possible_days:
      deadline -= 1
      # This means the only days left are really high, with much more time.
      if deadline < possible_days[0]:
      break



      However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.




              if deadline in possible_days:
      # One day cannot be used twice
      possible_days.remove(deadline)
      total += weight



      However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.



      Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):



      available = find(deadline).data
      if available >= 0:
      total += weight
      union(available, available - 1)


      ensuring that union chooses the smaller of the two associated data items.






      share|improve this answer











      $endgroup$
















        2












        2








        2





        $begingroup$

        Approach 1




        def main():
        ... monolithic method ...

        main()



        Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".






            from collections import defaultdict

        with open("DATA21.txt") as all_data:
        my_lines = iter(all_data.readlines())

        number_of_assignments = int(next(my_lines))

        homework_dict = defaultdict(list)

        for _ in range(number_of_assignments):
        d, w = [float(i) for i in next(my_lines).split()]
        d = int(d)
        # Setting up the dictionary

        homework_dict[d].append(w)



        This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.



        I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?



                d, w = next(my_lines).split()
        homework_dict[int(d)].append(float(w))





            all_deadlines = list(homework_dict.keys())

        all_deadlines.sort()



        I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.






            # Algorithm starts here



        See my previous point about factoring out methods.






            selected_assignments = []
        for deadline in all_deadlines:
        deadline_assignments = homework_dict[deadline]
        deadline_assignments.extend(selected_assignments)
        deadline_assignments.sort()



        This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:




        1. Sort homework_dict[deadline] and then merge two sorted lists in linear time.

        2. Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

        3. The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.



                difference = len(deadline_assignments) - deadline
        if difference < 0:
        selected_assignments = deadline_assignments
        else:
        selected_assignments = deadline_assignments[difference::]



        difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.





        It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.





        Approach 2



        A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.






            possible_days = [day+1 for day in range(biggest)]



        What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.






                while deadline not in possible_days:
        deadline -= 1
        # This means the only days left are really high, with much more time.
        if deadline < possible_days[0]:
        break



        However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.




                if deadline in possible_days:
        # One day cannot be used twice
        possible_days.remove(deadline)
        total += weight



        However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.



        Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):



        available = find(deadline).data
        if available >= 0:
        total += weight
        union(available, available - 1)


        ensuring that union chooses the smaller of the two associated data items.






        share|improve this answer











        $endgroup$



        Approach 1




        def main():
        ... monolithic method ...

        main()



        Either this is a throwaway program, in which case you don't need to define main at all, or it isn't, in which case you should only call main() if __name__ == "__main__".






            from collections import defaultdict

        with open("DATA21.txt") as all_data:
        my_lines = iter(all_data.readlines())

        number_of_assignments = int(next(my_lines))

        homework_dict = defaultdict(list)

        for _ in range(number_of_assignments):
        d, w = [float(i) for i in next(my_lines).split()]
        d = int(d)
        # Setting up the dictionary

        homework_dict[d].append(w)



        This looks to me like a method which should be factored out, so that there is clear isolation between the I/O and the processing.



        I don't understand the use of float and int. Would it not be clearer to avoid the double-coercion with something like this?



                d, w = next(my_lines).split()
        homework_dict[int(d)].append(float(w))





            all_deadlines = list(homework_dict.keys())

        all_deadlines.sort()



        I'm not sure what all_ adds to the name. deadlines works for me, or distinct_deadlines to be more explicit.






            # Algorithm starts here



        See my previous point about factoring out methods.






            selected_assignments = []
        for deadline in all_deadlines:
        deadline_assignments = homework_dict[deadline]
        deadline_assignments.extend(selected_assignments)
        deadline_assignments.sort()



        This is going to be doing a lot of sorts of already sorted data. That seems like a plausible bottleneck. I see at least three obviously better approaches:




        1. Sort homework_dict[deadline] and then merge two sorted lists in linear time.

        2. Use a priority queue of some kind for selected_assignments. Add homework_dict[deadline] to the queue and then pop it down to the desired length.

        3. The asymptotically best option that I can see, although possibly not the fastest in practice, would be to append the new assignments and then use a linear time median finder to slice the merged list.



                difference = len(deadline_assignments) - deadline
        if difference < 0:
        selected_assignments = deadline_assignments
        else:
        selected_assignments = deadline_assignments[difference::]



        difference is not an informative name. What does the difference actually mean? It's something like "number of assignments to discard". I might use excess as a name, or num_excess_assignments if I wanted to be more explicit.





        It would be nice to see some comments in the code explaining why it's correct. I would say that having read it carefully a few times I find its correctness plausible, but I wouldn't be willing to guarantee it.





        Approach 2



        A lot of the comments (about structure, names, proof of correctness, ...) on approach 1 apply here too.






            possible_days = [day+1 for day in range(biggest)]



        What is possible_days used for? in possible_days and possible_days.remove. This is a classic performance blunder (so don't feel bad, because you're far from the first person to stumble into it: clearly there are popular learning resources which should address this and don't). In general, the appropriate data structure for this use case is set, not list.






                while deadline not in possible_days:
        deadline -= 1
        # This means the only days left are really high, with much more time.
        if deadline < possible_days[0]:
        break



        However, it turns out that the use of possible_days is even more specialised than it seemed at first. Even if you replace the list with a set, this would still be a linear scan, and it doesn't need to be. You want the largest value smaller than a given one: if you keep the values in order then a binary chop will get there in logarithmic time rather than linear.




                if deadline in possible_days:
        # One day cannot be used twice
        possible_days.remove(deadline)
        total += weight



        However, updating the array needed for binary chop would not be logarithmic time. What you need is some kind of tree which allows searches and removals in logarithmic time. The access pattern is not going to be random, so to guarantee logarithmic time it probably needs to be a self-balancing tree, or some kind of self-pruning trie. I'm not aware of a suitable implementation in the standard library: I think you would have to write it yourself.



        Actually, thinking about it some more, I reckon you could use an augmented union-find data structure to get amortised linear time. Initially each index (and -1 as a sentinel) is in a singleton set with associated data equal to the index. The search and delete becomes (pseudocode):



        available = find(deadline).data
        if available >= 0:
        total += weight
        union(available, available - 1)


        ensuring that union chooses the smaller of the two associated data items.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 6 hours ago

























        answered 9 hours ago









        Peter TaylorPeter Taylor

        18.1k2963




        18.1k2963






















            Michael exe is a new contributor. Be nice, and check out our Code of Conduct.










            draft saved

            draft discarded


















            Michael exe is a new contributor. Be nice, and check out our Code of Conduct.













            Michael exe is a new contributor. Be nice, and check out our Code of Conduct.












            Michael exe is a new contributor. Be nice, and check out our Code of Conduct.
















            Thanks for contributing an answer to Code Review Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216115%2ffinding-maximum-homework-grade-ecoo-2018%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown