tag:blogger.com,1999:blog-15159421525597543682017-02-07T14:56:53.115-08:00uxcnJason Schulznoreply@blogger.comBlogger5125tag:blogger.com,1999:blog-1515942152559754368.post-6003046350084085162016-08-25T12:30:00.002-07:002016-08-26T05:12:21.767-07:00Sh-Utils<p dir="ltr">I wrote a set of simple shell commands a little while back. I actually ended up using them pretty regularly (quick and dirty source control, manual merging of configuration files, editing large compressed and delimited files in place, etc...). Since they save some non-trivial keystrokes, I cleaned up the code and pushed it to <a href="https://github.com/uxcn/sh-utils">github</a>. Likewise, I built a python package and published it via <a href="https://pypi.python.org/pypi/sh-utils">PyPI</a>.</p><p dir="ltr">The package is called <a href="https://pypi.python.org/pypi/sh-utils" style="font-family: sans-serif;">sh-utils</a>. The commands are...</p> pm - move p to p'<br>cpm - copy p to p'<br>upm - undo p to p'<br>sw - swap two paths<br>pt - pivot file over a command<br>pts - pivot file over a command (stdin)<br> <p dir="ltr">There are descriptions on the github and PyPI pages, but I'll try to go into a bit more detail.</p><p dir="ltr">The three <i>pm</i> commands are mostly meant to work with path (file, directory, pipe, socket, etc...) backups. For example, it's fairly reasonable to move or copy a file to a <i>.bak</i> (<i>.old</i>, <i>.orig</i>, etc...). Vice versa, it's fairly reasonable to want to undo it. So, <i>pm</i> takes a list of paths and moves them safely to a backup by prefixing (<i>-p</i>) or suffixing (<i>-s</i>) a value to the path name. Likewise, <i>cpm</i> does the same, but copies the path instead of moving it. <i>upm</i> safely undoes both of them.</p><p dir="ltr">For example, making a backup before editing a file...</p><pre class="prettyprint lang-sh">~ cpm foo<br /></pre><p dir="ltr">To recover it...</p><pre class="prettyprint lang-sh">~ upm foo\'<br /></pre><p dir="ltr">They're additive and subtractive as well, so it's easy to keep a stack of path(s) backups. Unfortunately, that can make things a bit messy and possibly painful. So, instead of having to run <i>upm</i> a bunch of times to unwind an stack of backups, <i>upm</i> has a flag (<i>-a</i>) that conflates the latest and earliest paths in a single command.</p><p dir="ltr">The next command is <i>sw</i>, which swaps two paths. Normally swapping two paths takes a few commands and a pivot file (move a path to a pivot, move second to first, move the pivot to the second). Instead, <i>sw</i> does everything in one command and as close to atomically as possible.</p><p dir="ltr">For example, swapping a file with its backup...</p><pre class="prettyprint lang-sh">~ sw foo foo\'<br /></pre><p dir="ltr">or...</p><pre class="prettyprint lang-sh">~ sw foo* <br /></pre><p dir="ltr">Last are two pivot commands. Similar to <i>sw</i>, doing something in place on a file normally needs more than one step and a pivot file. Instead, <i>pt</i> and <i>pts</i> do everything in one command and as close to atomically as possible.</p><p dir="ltr">Normally I would sort a file...</p><pre class="prettyprint lang-sh">~ sort foo > foo\'<br />~ mv foo\' foo<br /></pre><p dir="ltr">With <i>pt</i>...</p><pre class="prettyprint lang-sh">~ pt sort foo<br /></pre><p dir="ltr">Finally, <i>pts</i> is similar to <i>pt</i> except it passes the file on <i>stdin</i>, which can be useful for pipelines.</p><p dir="ltr">For example, if you wanted to uncompress a delimited file, sort it, parse it, square a column, and re-compress it, normally it would be...</p><pre class="prettyprint lang-sh">~ xz -dc foo.xz | sort | awk '{ print \$1, \$2 * \$2 }' | xz > foo.xz\'<br />~ mv foo.xz\' foo.xz<br /></pre><p dir="ltr">Instead, with <i>pts</i>...</p><pre class="prettyprint lang-sh">~ pts sh -c "xz -d | sort | awk '{ print \$1, \$2 * \$2 }' | xz" foo.xz<br /></pre><p dir="ltr">If anyone ends up using these and you have any questions, feel free to ask.</p>Jason Schulzhttps://plus.google.com/107454831271192954986noreply@blogger.com0tag:blogger.com,1999:blog-1515942152559754368.post-49328663465913081962014-06-03T22:23:00.000-07:002016-08-25T10:29:45.036-07:00Hypertext BrowsingI tend to browse the web using Firefox with some basic plugins, which probably is true for most people (Chrome, Safari, etc...). It's overkill for a lot of things though. A lot of the time, all I really need is a way to enter a URL, render some text, search for some text, follow links, and possibly enter text into forms. All of these can be done with a slightly simpler browser, <a href="http://w3m.sourceforge.net/">w3m</a>. <br /><br />If you are resource constrained, you prefer working strictly with text, or you don't have access to a graphics environment, w3m can come in handy. The footprint is a few orders of magnitude less than the graphical variants, and it's meant for keyboard navigation.<br /><br />The motions are probably familiar to anyone who's used Vi/Vim before.<br /><br /><table cellpadding="0" cellspacing="0"><tbody><tr><td><i><b><h></b></i></td><td>- left</td></tr><tr><td><i><b><j></b></i></td><td>- down</td></tr><tr><td><i><b><k></b></i></td><td>- up</td></tr><tr><td><i><b><l></b></i></td><td>- right</td></tr><tr><td><i><b><w></b></i></td><td>- next word</td></tr><tr><td><i><b><^></b></i></td><td>- line start</td></tr><tr><td><i><b><$></b></i></td><td>- line end</td></tr></tbody></table><br //>Directional keys can also be used. Navigating links and input fields is generally the same as the graphical browsers.<br /><br /><table cellpadding="0" cellspacing="0"><tbody><tr><td><i><b><tab></b></i></td><td>- next field</td></tr><tr><td><i><b><s-tab></b></i></td><td>- previous field</td></tr></tbody></table><br />Instead of the mouse, w3m gives a menu for links, which can be slightly faster than using a mouse (see <a href="https://en.wikipedia.org/wiki/Fitts%27s_law">Fitts' Law</a>).<br /><br /><table cellpadding="0" cellspacing="0"><tbody><tr><td><i><b><esc-m></b></i></td><td>- link list (move)</td></tr><tr><td><i><b><esc-l></b></i></td><td>- link list (follow)</td></tr></tbody></table><br />Another easy way to navigate is searching for text. Again, Vi/Vim users will probably recognize the default key bindings.<br /><br /><table cellpadding="0" cellspacing="0"><tbody><tr><td><i><b></></b></i></td><td>- search forward</td></tr><tr><td><i><b><?></b></i></td><td>- search backward</td></tr></tbody></table><br />Last, to open new URLs and tabs...<br /><br /><table cellpadding="0" cellspacing="0"><tbody><tr><td><i><b><U></b></i></td><td>- open URL</td></tr><tr><td><i><b><T></b></i></td><td>- new tab</td></tr></tbody></table><br />There are more complex combinations, but the above should be a good start. The full set of navigation and configuration options can be found in the <a href="http://w3m.sourceforge.net/MANUAL">man page</a> as well as the help screen (<b><i><H></i></b>).<br /><br />Putting it all together...<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-zN_VmFQgRO0/U4wSTpHFNmI/AAAAAAAACv4/5p_A4mdykeM/s1600/w3m-google.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="176" src="https://2.bp.blogspot.com/-zN_VmFQgRO0/U4wSTpHFNmI/AAAAAAAACv4/5p_A4mdykeM/s1600/w3m-google.png" width="320"></a></div><br />One of the things I've found w3m is good for is as a basic interface for things that have a form of HTTP/HTML API.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-8NGdPHcr6qQ/U46sTpAFHrI/AAAAAAAACwg/aNDtdd1lpr8/s1600/w3m-surfboard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="176" src="https://1.bp.blogspot.com/-8NGdPHcr6qQ/U46sTpAFHrI/AAAAAAAACwg/aNDtdd1lpr8/s1600/w3m-surfboard.png" width="320"></a></div><br />It's admittedly limited, depending on the protocol or features sites you try to browse rely on. Support for more recent protocols (SPDY, HTML5, AJAX, etc...) is decidedly lacking.<br /><br />Although, on the development side, it's also useful for a number of things like navigating doxygen/javadoc, parsing jhat output, and even browsing wikipedia.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-jnUrm9aw45o/U41UutQTUqI/AAAAAAAACwQ/fllL2k4_89I/s1600/w3m-boost.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="176" src="https://2.bp.blogspot.com/-jnUrm9aw45o/U41UutQTUqI/AAAAAAAACwQ/fllL2k4_89I/s1600/w3m-boost.png" width="320"></a></div><br />It's also useful to work with from the command line.<br /><br /><pre class="prettyprint lang-sh">~ watch -n 10 w3m http://www.wrh.noaa.gov/sew -dump</pre><br />I combine it with other things like <a href="https://tmux.github.io/">tmux</a>, <a href="http://www.mutt.org/">mutt</a>, <a href="http://www.vim.org/">vim</a> and <a href="https://www.gnu.org/software/emacs/">emacs</a> which makes it a little more useful, but the above are some of the basics.<br /><br />Downloads, documentation, etc... can be found on w3m's <a href="http://w3m.sourceforge.net/">homepage</a> for anyone curious. Jason Schulzhttps://plus.google.com/107454831271192954986noreply@blogger.com0tag:blogger.com,1999:blog-1515942152559754368.post-85874594747960368052014-02-15T11:08:00.000-08:002016-08-18T12:56:33.057-07:00Dynamic ProgrammingI was going through another <a href="http://projecteuler.net/">project euler</a> problem, and the solution ended up being a example of dynamic programming. I've gone through project euler <a href="https://uxcn.blogspot.com/2011/11/algorithm-complexity.html">solutions</a> in the past, so I thought it might be interesting to walk through another.<br /><br />In short, <a href="https://en.wikipedia.org/wiki/Dynamic_programming">dynamic programming</a> is a way of using cached information to optimize choice in a recursive problem. By eliminating the non-optimal choices, it reduces the complexity (possibly exponentially). It's actually a bit of an understated concept, since a lot of problems can naturally be expressed with recursion. A lot of the problems are also generally not feasible without it. Obviously the larger the problem, the larger the run-time improvement.<br /><br />The specific <a href="http://projecteuler.net/problem=67">problem</a> has to do with optimally traversing a graph. Since the problem was simple enough, I ended up writing the solution in <a href="http://yasm.tortall.net/">yasm</a> (amd64) assembly.<br /><br /><span class="Apple-style-span" style="font-size: large; font-style: italic;">"</span><span class="Apple-style-span" style="font-size: large; font-style: italic;">Find the maximum total from top to bottom in <a href="http://projecteuler.net/project/triangle.txt">triangle.txt</a>, a 15K text file containing a triangle with one-hundred rows."</span><br /><br />So, a valid move on the graph follows a directed edge to the left or right child. An exhaustive search is $\Theta(n2^{n-1})$ with respect to height ($n$ nodes per path, $2^{n-1}$ paths). It's <a href="http://www.wolframalpha.com/input/?i=number+of+atoms+in+known+universe">estimated</a> there are $10^{80}$ atoms in the known universe. So given only a few hundred rows, the number of distinct traversals possible is roughly equivalent to the number of atoms in all of known existence.<br /><br />Obviously, a pointer based tree structure implies the complexity, since it implies an exhaustive search. However, there are more ways than one to represent a tree. One possible representation to avoid the complexity is the same layout that's used for heaps (Eytzinger).<br /><br />Finally, the dynamic programming step is comparing each of the possible traversals, and caching the optimal one as if it had been taken...<br /><br /><pre class="prettyprint lang-asm"> segment .data<br /><br />a dq \<br />0, \<br />59, \<br />73, 41, \<br /></pre>. <br />. <br />. <br /><pre class="prettyprint lang-asm">l equ 100<br /><br /> segment .bss<br /><br /> segment .text<br /><br /> global _start<br /><br />_start:<br /><br /> xor rsi, rsi<br /> mov rdi, 1<br /> mov rdx, 1<br /><br />level:<br /><br /> inc rdx<br /> cmp rdx, l<br /> cmovg rax, [a + rsi*8]<br /> jg max<br /><br /> mov rcx, rdx<br /> mov rbx, rcx<br /> shl rbx, 3<br /> <br /> inc rsi<br /> inc rdi<br /><br /> mov rax, [a + rsi*8]<br /> add [a + rdi*8], rax<br /> mov rax, [a + rsi*8 + rbx - 16]<br /> add [a + rdi*8 + rbx - 8], rax<br /> <br /> dec rcx<br /><br /> inc rdi<br /><br />next: <br /><br /> dec rcx<br /> jz level<br /><br /> mov rax, [a + rsi*8]<br /> inc rsi<br /> cmp rax, [a + rsi*8]<br /> cmovl rax, [a + rsi*8]<br /> add [a + rdi*8], rax<br /><br /> inc rdi<br /><br /> jmp next<br /><br />max:<br /><br /> dec rdx<br /> jz _end<br /><br /> inc rsi<br /> cmp rax, [a + rsi*8]<br /> cmovl rax, [a + rsi*8]<br /><br /> jmp max<br /><br />_end:<br /><br /> xor rdi, rdi<br /> mov rax, 60<br /><br /> syscall</pre><br />With dynamic programming, run-time becomes $2n^2 - 4n - 4$ or $O(n^2)$ (the math is left as an exercise).<br /><br />And final execution time is...<br /><br />real 0m0.001s<br />user 0m0.000s<br />sys 0m0.000s<br /><br />So, other than what would probably be an extremely large scale solution taking the intuitive approach, the problem is essentially infeasible. Using dynamic programming, the problem can be solved in under a millisecond. Jason Schulzhttps://plus.google.com/107454831271192954986noreply@blogger.com0tag:blogger.com,1999:blog-1515942152559754368.post-31446035526232402782012-01-25T09:48:00.000-08:002016-08-18T12:50:47.300-07:00Sony PRSI bought a Sony PRS a while ago, which I've been using a decent amount. Managing books was a bit cumbersome though. Copying anything onto it normally needs a computer, a USB connection, and management software. Thankfully, I was able to simplify it a bit.<br /><br />The PRS has a built-in 802.11g radio, so obviously data was already being transmitted over ethernet. It also has a web browser, so again it obviously supported HTTP. Normally the browser only supports text/html though, which precluded a lot of stuff I actually wanted it for.<br /><br /><a href="https://code.google.com/p/prs-plus/">PRS+</a> helped fix this. It implements a lot of the things I thought were missing from the stock firmware; specifically non-html MIME type support. Actually it integrates this with the stock browser, which even made a user friendly interface possible. Authentication and encrypted HTTP unfortunately aren't implemented, which could make using it outside of home a bit impractical, but downloading from sites like <a href="https://www.gutenberg.org/">project gutenberg</a> normally doesn't need either.<br /><br />Finally, to make everything I wanted to be able to read available via HTTP, I setup an apache webserver with <a href="https://httpd.apache.org/docs/2.2/mod/mod_autoindex.html">mod_autoindex</a>. The mod_autoindex module isn't completely necessary, but it gives nice HTML listings for directories which makes things a bit more user friendly.<br /><br />Et, voila...<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-2onA06vCn_g/Tx96kaspHRI/AAAAAAAABeU/sOHOlHED8D8/s1600/prs_plus_browser_root.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="200" src="https://4.bp.blogspot.com/-2onA06vCn_g/Tx96kaspHRI/AAAAAAAABeU/sOHOlHED8D8/s200/prs_plus_browser_root.png" width="116" /></a></div><a href="https://2.bp.blogspot.com/-YHUvE7AVDcY/Tx96lN-M_MI/AAAAAAAABek/oSNQalbtjac/s1600/prs_plus_home_new.png" imageanchor="1" style="clear: left; display: inline !important; margin-bottom: 1em; margin-right: 1em; text-align: left;"><img border="0" height="200" src="https://2.bp.blogspot.com/-YHUvE7AVDcY/Tx96lN-M_MI/AAAAAAAABek/oSNQalbtjac/s200/prs_plus_home_new.png" width="116" /></a><br /><br /><br />A big thank you to the <a href="https://code.google.com/p/prs-plus/">PRS+</a> developers, otherwise I would have been stuck. For anyone curious, it adds a lot over the stock firmware (file browser, screenshots, etc...).<br /><div class="separator" style="clear: both; text-align: center;"></div>Jason Schulzhttps://plus.google.com/107454831271192954986noreply@blogger.com0tag:blogger.com,1999:blog-1515942152559754368.post-20580198016161189912011-11-01T09:47:00.001-07:002016-08-18T12:34:55.142-07:00Algorithm ComplexityI've been going through <a href="http://projecteuler.net/">project euler</a> problems lately, and they tend to be good examples of how algorithm complexity can be important. With a lot of them, reducing complexity takes a little insight. I thought it might be interesting to walk through a solution.<br /><br />To high level refresh, the complexity of some $f(n)$ that describes an algorithm can be characterized using... <br /><br />$O$ - asymptotic upper bound <br />$\Omega$ - asymptotic lower bound <br />$\Theta$ - asymptotic upper and lower bound <br /><br />There are also... <br /><br />$o$ - exclusive asymptotic upper bound <br />$\omega$ - exclusive asymptotic lower bound <br /><br />The specific <a href="http://projecteuler.net/problem=12">problem</a> has to do with triangle numbers. All code is haskell.<br /><br /><br /><span class="Apple-style-span" style="font-size: large; font-style: italic;">"What is the value of the first triangle number to have over five hundred divisors?"</span><br /><br /><br />The obvious algorithm is to compute each triangle number by summing, and iterate over the integers from one to the number counting each divisor. The code for this is straightforward.<br /><br /><pre class="prettyprint lang-hs">main = putStrLn $ show $ (fst . head)<br /> (filter ((>500) . snd) (zip t (map divs t)))<br /> where t = map tri [1..]<br /> <br />tri n = sum [1..n]<br /> <br />divs 0 = 0<br />divs n = length $ filter (\n' -> n `mod` n' == 0) [1..n]<br /></pre><br />It's simple, and it is correct. However, it's impractical.<br /><br />Consider the complexity. The run-time to sum $1..n$ is $n$. The run-time to iterate over all possible divisors of triangle number $n$ is $\frac{n^2 + n}{2}$.<br /><br />This gives a complexity of $\Theta(n^2)$. Since by definition the value being searched for has more than 500 divisors, the complexity to test each triangle number is impractical.<br /><br />However, there is a practical solution. Furthermore, it only requires some optimization. Before eliminating the asymptotic bottleneck though, there is a simple strength reduction.<br /><br />The $\Theta(n)$ algorithm to compute $\sum\limits_{i=1}^n i$ can easily be replaced with an $O(1)$algorithm.<br /><br /><pre class="prettyprint lang-hs">tri n = (n^2 + n) `quot` 2</pre><br /><span style="font-style: italic;"><span class="Apple-style-span" style="font-style: normal;">This equation sums $1..n$. It's often</span></span> attributed to Carl Friedrich Gauss as an anecdote It reduces run-time to $\frac{n^2 + n}{2} + 1$ which is still $\Theta(n^2)$. So, again it doesn't eliminate the asymptotic bottleneck. Optimizing the algorithm to count divisors takes a bit more insight.<br /><br /><pre class="prettyprint lang-hs">import Data.List (find, group)<br />import Data.Maybe (fromMaybe)<br /> <br />main = putStrLn $ show $ (fst . head)<br /> (filter ((>500) . snd) (zip t (map divs t)))<br /> where t = map tri [1..]<br /> <br />facts :: Int -> [Int]<br />facts x<br /> | (x < head pms) = []<br /> | otherwise = h : facts (x `div` h)<br /> where h = fromMaybe x (find (\y -> ((x `mod` y) == 0))<br /> (takeWhile (<= isqrt x) pms))<br /> <br />pms :: [Int]<br />pms = sieve (2 : [3,5..])<br /> <br />sieve :: [Int] -> [Int]<br />sieve (p:xs) = p : sieve [ x | x <- xs, x `mod` p > 0 ]<br /> <br />tri :: Int -> Int<br />tri n = (n^2 + n) `quot` 2<br /> <br />isqrt :: Int -> Int<br />isqrt = truncate . sqrt . fromIntegral<br /> <br />divs :: Int -> Int<br />divs 1 = 1<br />divs x = product $ map ((+1) . length) ps<br /> where ps = group $ facts x<br /><!-----></pre><br />This algorithm is based on a simple consequence of the <a href="https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic">fundamental theorem of arithmetic</a>. Instead of iterating over all integers below $\frac{n^2 + n}{2}$, it counts the combinations of prime factors. This replaces the $\frac{n^2 + n}{2}$ algorithm to find divisors with a $2\sqrt{\frac{n^2 + n}{2}}$ or $O(n)$ algorithm, since it only iterates over the primes below $\sqrt{\frac{n^2 + n}{2}}$<br /><br />The realized run-time actually has a tighter bound, since prime numbers tend to grow more <a href="https://en.wikipedia.org/wiki/Prime_number_theorem">sparse</a> further on the number line. However, knowing a bound of $O(n)$ is sufficient given the problem.<br /><br />The execution time for the final version is...<br /><br />real 0m0.037s<br />user 0m0.034s<br />sys 0m0.002s<br /><br />There is still room for micro-optimization (starting at 14410396, better cpu utilization, etc...). However, these would likely only give constant factor speedups. More importantly though, optimizing the complexity turned the impractical algorithm, into an algorithm that finds a solution in less than a second.<br /><div class="separator" style="clear: both; text-align: center;"></div>Jason Schulzhttps://plus.google.com/107454831271192954986noreply@blogger.com0