DNR's blog

By DNR, 5 months ago, In English

You’re given a set of jobs and integers $$$x$$$ and $$$y$$$. Each job is represented by a time interval $$$[l, r]$$$ (integral endpoints, $$$0 \leq l \lt r \leq y$$$). The goal is to find an assignment of jobs to machines (assume that we have an infinite number of identical machines) which minimizes the number of machines used, while also satisfying the following:

  • Each job is assigned to some machine.
  • No two jobs assigned to the same machine intersect (but their endpoints can touch, so $$$[a, b]$$$ and $$$[b, c]$$$ can be assigned to the same machine).
  • Every used machine has some contiguous “break interval” of length $$$x$$$, which doesn’t intersect with any job assigned to that machine (once again, endpoints can touch). In other words, there is some contiguous time period of length $$$x$$$ where no job is being executed on it.

I’m pretty sure there isn’t an exact solution in P, so I’m looking for decent approximation algorithms.

I could only think of a simple approximation algorithm with a bad approximation factor that's heavily dependent on the input data (although I'm pretty sure I could make a cursed div-2 D out of it).

The idea

There are also some pretty easy exact solutions which are exponential in the number of machines (which, unlike those exponential in the number of jobs, can occasionally be practical to run), but they're not interesting.

Are there any assumptions on the input data that would make this problem solvable in an interesting manner?

I would also be grateful if someone could direct me towards existing work on related problems if they know of it.

  • Vote: I like it
  • -17
  • Vote: I do not like it

»
5 months ago, hide # |
 
Vote: I like it +11 Vote: I do not like it

I don't understand the third bullet point, can you explain it?

  • »
    »
    5 months ago, hide # ^ |
     
    Vote: I like it +4 Vote: I do not like it

    Consider the set of jobs/intervals assigned to some machine. Then for every machine: after we assign the given intervals, it should be possible to assign an additional interval (not present in the original set) of length $$$x$$$ somewhere within the timeline $$$[0, y]$$$, such that it doesn't intersect with any of the old intervals on this machine (boundaries can touch).

    • »
      »
      »
      5 months ago, hide # ^ |
       
      Vote: I like it 0 Vote: I do not like it

      is it correct to say that for 2 jobs/intervals to be assigned into the same machine(which was empty) they have to:

      1. not intersects
      2. the different between the end of the first job and the start of the second job is >= x

      this is what I understand from 3rd point.

      • »
        »
        »
        »
        5 months ago, hide # ^ |
        Rev. 3  
        Vote: I like it +3 Vote: I do not like it

        No that's not what it means.

        Consider the following setup:

        • $$$y = 20, s = 2, x = 4$$$
        • on some machine, we assign the segments $$$[1, 3], [7, 9], [10, 12], [17, 18], [18, 20]$$$

        The assignment on this machine is valid because we can place a contiguous length-4 interval in some gap after this assignment (in this case, we can place one at $$$[3, 7], [12, 16], [13, 17]$$$). We do not need to be able to place a break between every pair of adjacent intervals, just once somewhere.

        So in this case, it doesn't matter that we cannot place a break between $$$[7, 9]$$$ and $$$[10, 12]$$$.

        • »
          »
          »
          »
          »
          5 months ago, hide # ^ |
           
          Vote: I like it 0 Vote: I do not like it

          I see, thanks.
          I thought the problem was easy, turns out I just read 3rd point wrong.

          Unfortunately I most likely won't solve your problem. Though, I will learn about the standard machine assignment problem soon.

»
5 months ago, hide # |
 
Vote: I like it 0 Vote: I do not like it

How did you prove that there isn't an exact solution in P?

»
5 months ago, hide # |
Rev. 2  
Vote: I like it 0 Vote: I do not like it

If i understood your question correctly, then i have an idea. So, let us try to do binary function on the minimum no of machines. Cause if job can be done in m machines then for sure can be done with m+1, m+2 ... machines.

Now, how my check function will look? i will do something like sweep line with sorting the ranges with there l parameter also i will maintain two sets, lets name them, working and idle.

working set will store end pos, machine id of the machine which are currently working.

idle set will store three info -> GotBreak, last ended job pos, machine id.

ok so GotBreak is a (0,1) which tells whether the machine got that x length break or not, last job ended position is just the endpoint of job which the machine did and lastly the machine id.

by updating GotBreak of every idle machine(i.e. when we reach a point where its last job done is atleast x position behind) and these sets. Now the last part left is how to assign the job to a machine. i say that the last element of idle set is best why?

so first if you have a machine which already have break nothing issue. now if no machine have break so assigning the job to machine which has finished its job latest is optimal ig?

and if set is empty then for sure return false, and at end if everyone got break then return true. I am not able to find whats incorrect in this approach. compression and all i can get this in nlog^2n ig maybe.

  • »
    »
    5 months ago, hide # ^ |
     
    Vote: I like it +3 Vote: I do not like it

    I am not able to find whats incorrect in this approach.

    I, on the other hand, am not able to find what’s correct in this approach.

    • »
      »
      »
      5 months ago, hide # ^ |
       
      Vote: I like it 0 Vote: I do not like it

      umm, can you explain what?

      • »
        »
        »
        »
        5 months ago, hide # ^ |
         
        Vote: I like it -8 Vote: I do not like it

        can you explain your idea through code or at least pseudo-code? in its current (extremely ambiguous) form, it just feels incorrect at worst and doubtful at best

        • »
          »
          »
          »
          »
          5 months ago, hide # ^ |
           
          Vote: I like it 0 Vote: I do not like it
          bool check(ll machines, ll x, ll y, vector<pair<ll, ll>> &sortedRanges) {
              multiset<pair<ll, ll>> idle; // <gotBreak, lastEndedJob>
              multiset<pair<ll, ll>> working; // <jobEndingPos, gotBreak>
          
              for (int i = 0; i < machines; i++) idle.insert({0, 0});
              ll job = 0;
              for (int pos = 0; pos <= y; pos++) {
                  while (!working.empty() && working.begin()->first == pos) { // job which ends at current pos
                      auto it = working.begin();
                      idle.insert({it->second, pos});
                      working.erase(it);
                  }
                  while (!idle.empty() && idle.begin()->first == 0 && idle.begin()->second + x <= pos) { // those who are idle for more than x time, update that they got break
                      auto it = idle.begin();
                      ll endPos = it->second;
                      idle.erase(it);
                      idle.insert({1, endPos});
                  }
                  while (job < sortedRanges.size() && sortedRanges[job].first == pos) { // starting jobs which starts at current pos
                      if (idle.empty()) return false;
                      auto it = --idle.end();// taking last is optimal ig
                      working.insert({sortedRanges[job].second, it->first});
                      idle.erase(it);
                      job++;
                  }
              }
              if (idle.begin()->first == 0) return false;
              return true;
          }
          

          i will do binary search on machines. code might be buggy but idea is more or less same.

          • »
            »
            »
            »
            »
            »
            5 months ago, hide # ^ |
             
            Vote: I like it 0 Vote: I do not like it

            This is obviously incorrect. Consider $$$x = 1, y = 3$$$, with intervals $$$[0, 1], [1, 2], [1, 3]$$$.

            We can use 2 machines in the following way:

            • Assign intervals $$$[0, 1]$$$ and $$$[1, 2]$$$ to the first machine. It has a break on $$$[2, 3]$$$.
            • Assign intervals $$$[1, 3]$$$ to the second machine. It has a break on $$$[0, 1]$$$.

            Your algorithm does not find this configuration, and requires 3 machines instead.

            • »
              »
              »
              »
              »
              »
              »
              5 months ago, hide # ^ |
               
              Vote: I like it 0 Vote: I do not like it

              ohh right right, got my mistake thankss. will think over it

            • »
              »
              »
              »
              »
              »
              »
              5 months ago, hide # ^ |
               
              Vote: I like it 0 Vote: I do not like it

              can you provide few test cases so that i can test if i get some logic?

»
4 months ago, hide # |
 
Vote: I like it +20 Vote: I do not like it

I believe there is a trivial 2-approximate algorithm (btw, I'm not sure that this problem is NP-hard, do you have any proof?).

Solve the problem with $$$x = 0$$$ exactly, then consider each machine in the optimal solution. I claim that each task $$$[l, r]$$$ is such that either $$$r \le y - x$$$ or $$$l \ge x$$$, otherwise $$$[l, r]$$$ can't even be placed alone on one machine, and the answer is NO. Now consider the optimal answer for $$$x = 0$$$ and split each machine into two: the first will have tasks such that $$$r \le y - x$$$ (break interval $$$[y - x, y]$$$), and the second will have tasks such that $$$l \ge x$$$ (break interval $$$[0, x]$$$). This way we have $$$ans \le 2opt_{x = 0} \le 2opt$$$.

  • »
    »
    4 months ago, hide # ^ |
     
    Vote: I like it -54 Vote: I do not like it

    obvious and worse than the one in the blog

    • »
      »
      »
      4 months ago, hide # ^ |
       
      Vote: I like it +60 Vote: I do not like it

      I don't understand what's the point of creating a blog to ask people to solve an open problem for you and respond comments with "obvious", "extremely ambiguous".

      Dude... CP community is sometimes the worst.

      • »
        »
        »
        »
        4 months ago, hide # ^ |
         
        Vote: I like it -18 Vote: I do not like it
        1. I didn’t ask anyone to solve an open problem, I asked for help with approximation algorithms.
        2. If the commenter I replied to had actually read the algorithm I described, he wouldn’t have wasted time describing the 2-approximation.
        3. If you had actually read the comment to which I charitably replied with “extremely ambiguous”, you wouldn’t have wasted time writing your comment.
  • »
    »
    4 months ago, hide # ^ |
    Rev. 2  
    Vote: I like it 0 Vote: I do not like it

    It is NP-hard because Job Scheduling Problem is NP-hard and infact this problem is a more general form of Job Scheduling problem, so if one can solve this is polynomial time then the former won't be a NP-hard.

    The problem can be solved using integer linear programming and applying some traditional ILP solvers.

    • »
      »
      »
      4 months ago, hide # ^ |
       
      Vote: I like it 0 Vote: I do not like it

      It is NP-hard because Job Scheduling Problem is NP-hard

      this problem is a more general form of Job Scheduling problem

      Not really, job scheduling usually assumes that job start times are not fixed

»
4 months ago, hide # |
 
Vote: I like it +8 Vote: I do not like it

Isn't it not guaranteed that a solution even exists? If some job's time interval doesn't allow a break interval even if it's the only one on its machine, for example.