1

I am fairly new to optimization problems, so please forgive my lack of knowledge.

That said, I'm trying to write a program that takes an NxM matrix randomly filled with 0's and 1's, then reduces this matrix by selectively deleting rows and columns. Ultimately, I want a (N - a)x(M - b) matrix where every ijth value equals 1 should such a matrix exist.

There are two constraints: Constraint 1) N - a > c for some c > 0 (I.e., there is a minimum number of rows); Constraint 2) Using the 1st constraint, out of all possible solutions I want the most optimized. (I.e., Max[(N-a)*(M-b)] or the greatest number of 1's remaining.)

In my opinion, this is a good beginners exercise. The approach I've thought of so far is:

Step 1) for every row and column, find it's % 0's, then take the % 0's for all the columns/rows respectively that intersect this row/column with 0.

Step 2) Penalize "double" deletions of 1's (I.e., if I deleted a 1 in row 5 by removing column 16 then later had to delete row 5) by multiplying every individual row or column % with (L+s)/(M) where L is the current row/column length, s is the number of 0's previously deleted in this row/column, and M is the original row/column length.

Step 3) Further penalize row deletions by multiplying a row group average % by 1 - r/(N - c - k) where N is the original total number of rows, c is the minimum number of rows, k is the number of rows deleted thus far, and r is the number of rows pending deletion.

Step 4) Find the distance in standard deviations from each row or column % 0's to the mean % 0's of the respective column/row group described above. If there are less than 3 members of the respective group, calculate the real line distance instead.

Step 5) Delete the row/column or respective column/row group with the largest positive distance, repeat steps 1-5 until a matrix containing only 1's is obtained.

Does this method make sense? Is there a simpler or more elegant approach? Can it be done without iteration, or be solved exactly? Any advice and/or criticism is appreciated.

Curator
  • 65
  • 5
  • Sorry, there was a fairly big mistake in step 1, so I edited this to fix it. – Curator Oct 25 '14 at 22:58
  • I don't know how to answer your question, but here is a good tutorial on mathjax http://meta.math.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference . Instead of (L+s)/(M) you can write $$\frac{L+s}{M}$$ which is a lot easier to read. – Joao Oct 26 '14 at 01:15
  • Thank you Joao. I will edit this now, while also trying to correct my methodology (simulations commence!) – Curator Oct 26 '14 at 10:38

1 Answers1

0

About exact approaches: If you want to find the minimum number of row/column deletions, then this is equivalent to Vertex Cover in a bipartite graph, which can be solved by matching techniques in polynomial time. If $a$ and $b$ are fixed in the input, your problem is Constraint Bipartite Vertex Cover, which is NP-hard (see for example here). However, you have as objective to minimize the number of remaining 1s, which is slightly different from these two. Still, the techniques used for these two problems might be useful.