error:14077458:SSL routines:SSL23_GET_SERVER_HELLO:reason(1112)Chances are your Apache "ServerName" config is different (e.g. localhost) from what's defined in your SSL certificate (e.g. mycompany.com).
Proof of Computational Work
I was having dinner with a good friend and former co-worker a few weeks ago and we ended up philosophizing about the things we were doing at LimeWire. An important part of our day to day work was detection and prevention of Spam on the Gnutella Network.
I believe this idea is somewhat related to Bitcoins that use computational complexity for proof of work. Hashcash
The first question you should ask yourself is why Spammers do what they do? They do it because it's profitable and cheap. What if we make it (possibly very) expensive for them? CAPTCHA are one method to do it by introducing Human labor to the process but what if that is not an option or not desired?
The idea is to force clients to do work for us and have them provide some proof of their work. We can do this by introducing a secure Hash-Function to the system and requiring that the output of the Hash-Function meets some pre-defined requirement. The requirement can be something like the 20 bits of the hash must be zero. This can be achieved by appending (possibly) random padding information to the original message and keeping doing it (work - O(n) or worse) until some data has been found that yields to an acceptable hash. It then sends the original message along with the padding information to the server and the server can verify it in one step by using the same Hash-Function (proof - O(1)).
Here is a diagram of the workflow:
The number of bits and the complexity of the Hash-Function can be used as a crank to drive the cost up and down. I would also recommend using some random one-time token that is provided by the server (state in the session context) to make sure that messages can't be sent multiple times. In case of Web-Apps you may even use the fact that they've to run your JavaScript for your advantage.
No more simple spamming with curl & Co. Your users will ideally notice nothing but for Spammers it's translating directly into CPU Hours and hard cash.
Simple PATRICIA Trie
One thing I've noticed in the emails with my PATRICIA Trie's users is that they've often very specific use-cases and requirements. It seems our attempt to make the Trie generic and cramming everything under a Map + SortedMap interface is maybe not the right approach. A lot of our implementation complexities are a result of the requirements these two interfaces impose and they make it very hard to change the code. What if we relax a few things, don't try to be 100% according to specification, don't try to make all methods super efficient and just cherry pick some features from SortedMap, NavigableMap?
I started off with a PATRICIA Trie as described in Robert Sedgewick's Algorithms in Java. It's not the most practical implementation but a good starting point to refresh one's rusty memory. This was followed by a new PATRICIA Trie implementation where I was focusing on getting the most important Trie operations right: put(), get(), select() and traverse(). An important decision was to not optimize the remove() operation. It's relatively complicated and would require two additional fields per Node (parent + predecessor) to make it efficient. An another decision was to implement the Map interface (to get some compatibility with the Java Collections) but be relaxed about some operations such as entrySet(), keySet() and values(). They return read-only copies of the current state. That means you can't use them to modify the underlying PATRICIA Trie and they're not being updated if something happens to the Trie.
The last step was to take the new PATRICIA Trie and do something with it. One thing users have been asking me is how to use primitive types for keys for example. They want to use raw int values. With the new PATRICIA Trie it's a very simple task and it took me about 30 minutes to make the required changes. This breaks certainly all kinds compatibilities with Java Collections but that is OK because it's a very specific use-case.
The code for all three PATRICIA Trie versions can be found in a new repository. Consider it a toolbox for custom PATRICIA Tries. Enjoy!
Discuss your Lunch
ANN: Roger's New Lunch Policynow that our team is complete I would like to announce my new lunch policy.
I've decided to no longer eat lunch in the office or at places that have no proper seating. -- Roger Kapsi 09/29/2008
This is an email I sent in September, 2008 to everyone at Livestream (formerly known as Mogulus) after I had been working for a year straight. My personal goals were to get out of the "not so great" office (an understatement) and eat some quality food for lunch. I din't care about the weather or if people would join me or not.
Some co-workers started to join me on a regular day-to-day basis. The most notable people are Ben, James, Rick and Paul. And let me tell you, we had some ridiculous lunches together.
But this blog post is not about our lunches. It's about our (Paul, Ben, Rick, James' and me) brain child called Lunch Discussion. It's a place to discuss, plan and rate your lunches with others (e.g. co-workers). The first step is to create a list of places you go for lunch. Users can pick places from the list and discuss whether or not they want to go there. By the time you come back it will ask you whether or not you went for lunch, where you went and if you liked it. The people who picked the place(s) you ended up eating at receive Karma points and you receive credit for eating lunch.
Over time you're building up collective and individual statistics such as: Best Restaurant, Most Eaten At, Best Picker and Fattest User. The statistics stay and the discussions get deleted for good every night. Each day is a new lunch discussion and you can explore the limits of Free Speech.
You need a Gmail account to get started. Just go to www.lunchdiscussion.com/signup and login.
Special thanks and extra Karma points go to Paul for taking the idea and turning into reality.
UPDATE: Paul released the source code on github.com. Enjoy and happy hacking!
A big flaw in Java's nanoTime()
So what about System.nanoTime()? It seems to return a linearly increasing number of time stamps for the next 292 years from the point you turn on the computer. The JavaDoc doesn't mention any potential problems besides that nanosecond precision is maybe not supported by the underlying operating system. It should be sufficient to measure time or to timestamp events.
It turns out nanoTime()'s behavior is actually quite unspecified and it's unclear what the clock's expected properties are. Chances are it's using the Time Stamp Counter (TSC) and modern computers have one of them for each core. It's up to the underlying Operating System or Hardware to keep these TSCs in sync (or not) or provide some other high precision clock etc. etc. From the developers perspective it's completely unclear what nanoTime() is actually returning.
There are a few things that can happen if nanoTime() is based on TSCs and if there are multiple CPUs or COREs at play or if the underlying Operating System is messing with the CPU/CORE frequencies (e.g. switching to a lower clock speed to save energy). This can be best seen if you're trying to measure time by doing two subsequent nanoTime() calls and something happens in between these calls. Let's take a look at the following example.
while (true) {
long startTime = System.nanoTime();
// Let's hope the current Thread is switched to an another CORE or
// enters some power saving mode that lowers the TSC's frequency
Thread.yield();
long time = System.nanoTime() - startTime;
assert (time >= 0L);
}It's pretty simple and straight forward. We're simply measuring the amount of time that elapses between the two nanoTime() calls and the yield() tries to enforce a context switch (hopefully to a different CPU or CORE). In practice you've to run the code on multiple Threads to catch these fluke events but I've managed to get negative time values in my experiments*.
The code in my project looks actually more like this (the subsequent nanoTime() calls are quite "far apart" from each other) and I see these fluke events once or twice an hour.
ExecutorService executor
= Executors.newCachedThreadPool();final long startTime = System.nanoTime();
executor.execute(new Runnable() {
@Override
public void run() {
doSomething();
long time = System.nanoTime() - startTime;
assert (time >= 0L);
}
});
You may ask what the outcome is? Well, the outcome is that you've to deal with the exact same problems as with currentTimeMillis(). The problems have very different origins but at the end of day you'll see the same errors (time is leaping back and forth). So neither can be really trusted and if you don't need the additional precision you may equally well stick with currentTimeMillis() to get nice things such as the fact that it's based on the UNIX System Time.A few links that might be useful: [1] [2] [3] [4] [5]
* MacBook, Mac OS X 10.6.5, Java 1.6.0_22 and Intel Core 2 Duo
DHT Security
One problem we were facing was that an attacker could spawn a bunch of malicious Nodes and pick their IDs strategically (due to lack of central control) so that they would take over a certain key-space and make the affected section of the DHT essentially unusable or let's say not trustworthy.
The simple idea I had to address this problem was to limit the number of hosts coming from the same Class-C Network per bucket. Once a certain threshold was reached the routing table would refuse to add a new host from the affected subnet and an attacker would have to spread among many subnets to pull this off.
It's great to see that this simple idea is living on in gtk-gnutella and I've generalized it since then and implemented it as a standalone class in my ardverk-base library.
Async Futures
An API extension for Java Futures and Non-Blocking + Asynchronous I/O (and possibly other use-cases).
AsyncFutures are something I came up with during my time at LimeWire when I was working on the Mojito DHT and they are conceptually very similar to Apache MINA's IoFutures. The very basic idea of an AsyncFuture is that it decouples starting and returning a value from each other. A simple example would be sending a message from one Thread and receiving the response from an another Thread. With a standard Future one would have to put the first Thread into waiting state before the Runnable or Callable objects return, wait for the response and notify the waiting Thread when the response arrives. It works but doesn't scale very well if large amounts of messages need to be sent as each request will require its own Thread.
In addition, AsyncFutures have built-in support for timeouts and callbacks. Timeouts ensure that all operations are deterministic in terms of the time they need to complete and callbacks enable the user to receive completion events instead of having to wait for the Futures to complete.
NOTE+CAUTION: An AsyncFuture is not an 100% compatible drop-in replacement for Futures. To be more precise, the locking methodologies of AsyncFutureTask and FutureTask are fundamentally different which leads to a slightly different AsyncFuture behavior if it's used outside of its originally intended use-case context (non-blocking computations).
UPDATE: As of Version 0.4 AsyncFutureTask may be used as a drop-in replacement for FutureTasks.
For further information and a download link see here: Async-Future
Bee-Encoding (Bencoding)
Bencode (pronounced "Bee-Encode") is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data.It supports four different types of values:
Bencoding is most commonly used in .torrent files. These metadata files are simply bencoded dictionaries.
- byte strings,
- integers,
- lists, and
- dictionaries (associative arrays).
While less efficient than a pure binary encoding, bencoding is simple and (because numbers are encoded in decimal notation) is unaffected by endianness, which is important for a cross platform application like BitTorrent. It is also fairly flexible, as long as applications ignore unexpected dictionary keys, so that new ones can be added without creating incompatibilities. -- Wikipedia.org
I have just released a very simple implementation of a Bencoding Input- and OutputStream in Java. It's a very lightweight implementation with only two classes and it doesn't depend on any external libraries.
I use it in early stages of (Network) application development to test my prototypes because it's much easier to use than for example Google's Protobuf.
Canceling ScheduledFutures (Memory Leak)
In short, ScheduledFuture.cancel() or Future.cancel() in general does not notify its Executor that it has been cancelled and it stays in the Queue until its time for execution has arrived. It's not a big deal for simple Futures but can be a big problem for ScheduledFutures. It can stay there for seconds, minutes, hours, days, weeks, years or almost indefinitely depending on the delays it has been scheduled with.
Here is an example with the worst case scenario. The runnable and everything it's referencing will stay in the Queue for Long.MAX_VALUE milliseconds even after its Future has been cancelled!
public static void main(String[] args) {
ScheduledThreadPoolExecutor executor
= new ScheduledThreadPoolExecutor(1);
Runnable task = new Runnable() {
@Override
public void run() {
System.out.println("Hello World!");
}
};
ScheduledFuture> future
= executor.schedule(task,
Long.MAX_VALUE, TimeUnit.MILLISECONDS);
future.cancel(true);
}You can see this by using a Profiler or calling the ScheduledThreadPoolExecutor.shutdownNow() method which will return a List with one element in it (it's the Runnable that got cancelled).
The solution for this problem is to either write your own Future implementation or to call the purge() method every now and then. Luckly we've been always using a custom Executor factory at work and it was just a matter of doing something simple as:
public static ScheduledThreadPoolExecutor createSingleScheduledExecutor() {
final ScheduledThreadPoolExecutor executor
= new ScheduledThreadPoolExecutor(1);
Runnable task = new Runnable() {
@Override
public void run() {
executor.purge();
}
};
executor.scheduleWithFixedDelay(task, 30L, 30L, TimeUnit.SECONDS);
return executor;
}
PATRICIA Trie
A radix tree, Patricia trie/tree, or crit bit tree is a specialized set data structure based on the trie that is used to store a set of strings. In contrast with a regular trie, the edges of a Patricia trie are labelled with sequences of characters rather than with single characters. These can be strings of characters, bit strings such as integers or IP addresses, or generally arbitrary sequences of objects in lexicographical order. Sometimes the names radix tree and crit bit tree are only applied to trees storing integers and Patricia trie is retained for more general inputs, but the structure works the same way in all cases. -- WikipediaThis blog post is not about the technical details of the PATRICIA Trie but about the historical background of the implementation I've just released on Google Code.
Some time in late spring of 2005 I was skimming through Robert Sedgewick's book "Algorithms in Java" and the PATRICIA Trie caught my eye (Chapter 15.3, Page 650). I thought it's a neat Algorithm and started to work on an implementation that extends the Map interface. The remove operation turned out to be very tricky. I couldn't find any books or other implementations that would either explain how to do it or do it more efficiently than O(n) (i.e. rebuilding the Trie). I spent quite a few nights with figuring out how to do it and came up with a couple of solutions. It's probably worth a dedicated blog post.
I sent it to LimeWire along with a patch but they didn't use it before I started my internship at LimeWire in late 2005. The same summer we were talking about DHTs and in particular Kademlia. I kind of figured that PATRICIA can be used for the routing table but it was too early to talk about it and I was too busy with writing my final paper for school but that is how the Mojito DHT [1] project was eventually started in December 2005.
Sam Berlin returned in 2006 to LimeWire and he was looking for ways to filter IP-Addresses very efficiently. We were facing the problem that each Message needed to be matched against a huge blacklist of ranges with roughly 100k entries.
That is how he got interested in PATRICIA. He extended it by the SortedMap interface and more importantly he figured out how to implement the various views like entrySet or the Iterators without having to pre-traverse the entire Trie.
We decided to contribute it to the Apache Commons Collections and Google Collections Library but they are quite slow with picking up new things.
I have therefore started a new Google Code project that is hosting the most recent version of the Trie. You can find and download it from here.
BestBuy is Pathetic!
I decided to buy my very first TV yesterday. It took me only 28 years! I did some research and figured the Samsung LN37A550 is the one I want.
I found one at the 23rd and 6th store in NYC. The box is somewhat big (mainly because of the pre-mounted stand) and a BestBuy employee helped me to carry it to the cab. As a matter of fact, he was carrying it and I did absolutely nothing.
At the door we were stopped by this pathetic moron who wanted to see my receipt. Wtf??? Honestly, wtf? How stupid must one be???
A little reality check:
- I've just spent more than $1000 in your store
- The box is being carried by a BestBuy employee
And you dare to ask me for the receipt and treat me like thief? Wtf is wrong with you people? That was the last time I bought something at BestBuy!
Ubuntu 8.04 and Multiple Monitors
With the ATI Restricted Drivers comes a tool called aticonfig which makes it very easy to setup two monitors. The only little problem I've noticed is how the Mac Pro/Mac OS X and the ATI Restricted Drivers/Ubuntu count the DVI ports. They count them in opposite directions.
If you turn on your Mac Pro only one display will be on at first. That is the main screen if you want. Once Mac OS X is up it will show the login dialog on the same screen. The ATI Restricted Drivers/Ubuntu count apparently in the other direction and the login dialog shows up on the other screen.
In my case, if I turn on my Mac Pro the left hand screen turns on and if I boot into Mac OS X the login dialog will be shown on the same screen. However, if I boot into Ubuntu the login dialog will be shown on the right hand screen.
I haven't been able to "fix" that and I had to drag-drop Gnome's top and bottom menu bars to the other screen. It's not a big issue though (just annoying).
Anyways,Â
cd /etc/X11sudo cp xorg.conf xorg.conf.baksudo aticonfig --dtop=horizontal
sudo aticonfig --dtop=horizontal,reversed
