Friday, June 13, 2014

A real-world crowdfunding preview for P2P OS?

I've been waiting for some time to find a kickstarter project comparable to P2P OS in terms of potential contributor base, and now i think i've found one that might serve as a good reference: http://www.kickstarter.com/projects/mmv/console-os-dual-boot-android-remastered-for-the-pc. My guess is that P2P OS should be able to attract at least the same level of funding as this project (considering the two projects' target audiences), so it'll be interesting to see just how much this project ends up with in its pockets.

PS
Okay, there's a catch to this comparison: ConsoleOS actually trades something with its backers, namely free-for-life updates for a $10 contribution. With P2P OS, on the other hand, i haven't really thought about a mercantile proposition yet (in part because, at the time i'm writing this, i'm not even sure i want to go the crowdfunding route).

Thursday, June 5, 2014

It's June 5th, time to reset the net (or somethin')


There's Alexis making some noise here: SaveYourPrivacyPolicy.org

and you have Eduard beating his drums here: ResetTheNet.org

and then some more noise here: FightForTheFuture.org

and... well, you get the drift.


Friday, February 7, 2014

Critical milestone reached: framework-independent cross-platform app development got the green light

It's been about two months since my libposif-0.9 library started to show its first signs of life, and now i have my first libposif-1.0 with [what i believe is] a stable API. When i first defined the libposif API it wasn't totally clear to me how it would integrate in a message loop-based environment (such as is the case with all modern GUI frameworks, e.g. GTK, Borland, MS, Qt), so during the past couple of months i had to re[de]fine the API specifications and make the required implementation changes.

Well, libposif-1.0 is now here, together with my first fully-functioning libposif-1.0-based GUI application. By far the #1 candidate for a host environment is the Qt framework because it's both LGPL-ed and multi-platform, but by no means does a libposif-based application rely on any of the Qt-specific mechanisms (no signal/slot, event loop, etc dependencies whatsoever): in fact, all the fundamental multi-threading and messaging-related mechanisms available in Qt have a [somewhat equivalent, or more sophisticated] native C++11 implementation in libposif, and all that is required for integrating a libposif-based application into any host environment, be it event loop-based or not, is a sequence of three next-to-trivial steps:
  1. Derive a host environment-specific class from a libposif-defined "MessagingInterface" abstract class - e.g. "MyMessagingInterface": this class is the host environment's messaging interface with the application, and it implements [pre-processing and] relaying of application-defined, environment-agnostic messages sent back and forth between the libposif application's I/O buffer and the native environment's components (windows/widgets/etc)
  2. Instantiate a MyMessagingInterface object inside the host environment, e.g. by including it in a GUI application's main window or inside an invisible form, etc
  3. Cross-link the native environment's objects (i.e. windows, forms, dialogs, etc) with the MyMessagingInterface object (nothing weird or hi-tech here, just some plain old pointer cross-referencing between the host environment's native objects and the MyMessagingInterface object, say 10 minutes of typing some pointer variable declarations and some pointer assignments).
Sure enough, the list above is just a PowerPoint-style list and nothing more, but the point is that that's really all there is to integrating a libposif-based application into any host environment; once the above procedure has been walked through, the libposif application's output messages will trigger methods in the host environment, and the host environment's events will push messages to the libposif application's input buffer.
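For concreteness, here's a minimal sketch of the three steps above. Note that libposif's real API isn't published, so the class and method names below (pushToApp, relayToHost, the windows map) are my own hypothetical stand-ins for the pattern, not actual libposif code:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Illustrative sketch only -- all names below are hypothetical
// approximations of the three-step integration pattern.

// The libposif-side abstract class from step 1.
struct MessagingInterface {
    virtual ~MessagingInterface() = default;
    // host environment -> application: push a message into the app's input buffer
    virtual void pushToApp(const std::string& msg) = 0;
    // application -> host environment: deliver an app output message to the
    // native object (window/widget/etc) named by 'target'
    virtual void relayToHost(const std::string& target, const std::string& msg) = 0;
};

// Step 1: the host-specific derivation, e.g. for a Qt main window.
struct MyMessagingInterface : MessagingInterface {
    std::string lastAppInput; // stands in for the application's input buffer

    // Step 3: the "pointer cross-referencing" -- native objects register a
    // handler under a name, so app output can be routed to the proper window
    std::map<std::string, std::function<void(const std::string&)>> windows;

    void pushToApp(const std::string& msg) override { lastAppInput = msg; }
    void relayToHost(const std::string& target, const std::string& msg) override {
        auto it = windows.find(target);
        if (it != windows.end()) it->second(msg); // deliver to the right window
    }
};
```

Step 2 is then simply instantiating a MyMessagingInterface inside the GUI's main window (or an invisible form) and registering each window's handler in the map.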

And because a picture's worth a thousand words, here's how a libposif-based application integrates in a host environment...


...and here's a screenshot of my test application, integrated in a Qt-based GUI:


A few words about the test application above and how it is internally implemented by making use of libposif's features:
  1. first of all, it is a proof of concept for an event loop-based environment integration:
    • the two "Send to" buttons in the main window actually send a message to the application, and the application sends back the message to the GUI which directs it to the proper window (as specified in the message)
    • all the other buttons send messages to the application, which then relays them to the appropriate Automaton for processing
  2. tests the threading/automata model, namely:
    • each counter is implemented as an Automaton in a separate Thread
    • each counter Automaton is implemented by having it send a scheduled message targeted to itself, with the schedule specifying that the message is to be actually sent with a specified delay after the SendMessage() method invocation (scheduled messages and targeted messages are examples of features that do not have a Qt equivalent)
      • counter 1 in the Main window sends a message to itself with a precise one-second delay, thus making the counter increment once per second
      • counter 2 in the Main window sends a message to itself with a three-second average delay and a specified delay dispersion (in %), i.e. the messages will actually be sent with an average, but not precise, 3-second delay (e.g. for a 40% dispersion the successive delays might be 3, 2, 2, 4, 3, 4... seconds), which will cause the counter to be incremented on average, but not precisely, every 3 seconds
    • the two counters can be started/stopped from their corresponding Start/Stop buttons: each button sends a message to the application, which then relays the message to the corresponding counter Automaton
    • counter 1 broadcasts a message to all the Automata in the application every time it increments; the libposif method for having an Automaton broadcast a message is BroadcastMessage(), and it is roughly equivalent to Qt's signal()
    • the two Reg/Unreg buttons register/un-register a connection between the "tick" messages broadcast by the counter 1 Automaton and a corresponding "divider Automaton" represented by the yellow counters to the right of said buttons, where each "divider Automaton" is dividing the incoming "tick"s (broadcast by counter 1) by the number specified in its corresponding drop-down list (i.e. by 1 and by 2 in the screenshot above); the libposif methods for registering/un-registering an Automaton as a listener to a broadcasting source are AddMessagingRoute(src, dest) / RemoveMessagingRoute(src, dest), and are roughly equivalent to Qt's connect(src, dest) / disconnect(src, dest) methods
  3. finally, the Main window's "Show Clock Form" menu item illustrates how a message can be sent by a window (namely the Main window) to another window (namely the Form window) without going through the application's Messaging Interface, i.e. this is an internal Qt message which never leaves the Qt-based GUI
And because talk is cheap, here's a code snippet that illustrates how a counter automaton is actually implemented by having it send a scheduled message back to itself:
  • onMessageReceived() is a predefined virtual function of libposif's Automaton abstract class, and it is triggered (by libposif's internals) each time an Automaton object receives a message (this method has to be implemented by each specific automaton that is derived from libposif's Automaton abstract class)
  • SendIntercomMessage() sends a scheduled message to a specified Automaton (which can be - and in the code snippet below is - the sender Automaton itself)
  • BroadcastMessage() broadcasts a message to all the automata in the application; the message will actually be received (and processed) only by those Automaton objects that have been set up to listen to the sender Automaton (via AddMessagingRoute())
  • SendIOMessage() sends a message to the application's Messaging Interface
  • State is the Automaton's [integer] state variable: it is a mandatory component
    of any automaton, declared in the Automaton base class

int MyClockAutomaton::onMessageReceived(
   const message_t& msg, 
   const alphanum_t& sourceThreadId, 
   const alphanum_t& sourceAutomatonId) 
{
   switch (State) {
   case counter_on:
      if (msg==CLOCK_TICK) {
         assert(sourceThreadId==parentThread()->threadId() &&
                sourceAutomatonId==automatonId());
         counter++;
         SendIOMessage((message_t)PRINT_COUNTER<<counter);
         BroadcastMessage(DIVIDER_TICK);
         SendIntercomMessage(CLOCK_TICK,"","",rate); //tick
      }
      if (msg==STARTSTOP_COUNTER) {
         assert(sourceAutomatonId=="" && sourceThreadId=="");
         State=counter_off;
      }
      break;
   case counter_off:
      if (msg==STARTSTOP_COUNTER) {
         assert(sourceAutomatonId=="" && sourceThreadId=="");
         State=counter_on;
         SendIntercomMessage(CLOCK_TICK,"","",rate); //start
      }
      break;
   }
   return 1;
}
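As a side note, the "delay with dispersion" scheduling used by counter 2 can be sketched in a few lines. This is just my illustration of the idea (the nextDelayMs() helper is an assumed name, not libposif's actual implementation):

```cpp
#include <cassert>
#include <random>

// Hypothetical sketch of counter 2's scheduling: draw each delay uniformly
// from [avg*(1-d), avg*(1+d)], where d = dispersion/100, so successive
// delays vary individually but average out to 'avgMs' over time.
int nextDelayMs(int avgMs, int dispersionPct, std::mt19937& rng) {
    const double d = dispersionPct / 100.0;
    std::uniform_int_distribution<int> dist(
        static_cast<int>(avgMs * (1.0 - d)),
        static_cast<int>(avgMs * (1.0 + d)));
    return dist(rng);
}
```

For counter 2 (3000 ms average, 40% dispersion) each delay lands somewhere in [1800, 4200] ms, while the long-run average stays at 3 seconds.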


Well, all in all it's been a bit of a long (and occasionally bumpy) road to reach this point, but given what i know is needed to implement the P2P OS algorithms, there was simply no way of cutting corners: i needed an asynchronous computing framework to implement autonomous agents that talk to each other, i needed this framework to be exclusively standards-based in order to be truly cross-platform, and i wanted complete control over how my applications will integrate in any host environment in order not to rely on any particular host framework (be it open-source or not). And libposif-1.0 does just that. So, finally, i'm now ready to start working on P2P OS itself.

PS
It is my intention to release libposif as a stand-alone open-source module, but before doing that i'll have to write at least a brief documentation for its API, and i have no clue when/if i'll find enough energy to do that. Until then, as always, anyone interested just drop me a line.

Saturday, December 28, 2013

IPv6 adoption in exponential growth stage

This is the third year in a row that sees IPv6 adoption growing exponentially, with a factor a bit over 2:1 year-on-year (2011 -> 2012 -> 2013). Since we're in the early deployment stage, this trend can be expected to continue for at least several years (and maybe with an even larger factor), such that by the end of 2015 a 10% IPv6 penetration is most likely a conservative estimate.
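As a quick sanity check on that estimate: assuming a ~2.5% penetration at the end of 2013 (my assumption, roughly in line with Google's published IPv6 adoption statistics at the time, not a figure from this post) and the observed 2:1 yearly factor:

```cpp
#include <cassert>

// Back-of-the-envelope compound-growth projection; the 2.5% starting
// point is an assumed figure, the 2x yearly factor is from the post.
double projectAdoption(double startPct, double yearlyFactor, int years) {
    double p = startPct;
    for (int i = 0; i < years; ++i)
        p *= yearlyFactor; // one doubling per year at the observed 2:1 rate
    return p;
}
// projectAdoption(2.5, 2.0, 2) == 10.0, i.e. ~10% by the end of 2015
```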


The IPv6 penetration, even in low numbers, is important for this project because IPv6 can be used to do the heavy lifting in the routing ring: the more IPv6 connections are deployed, the less the bandwidth strain on the routing ring peers.

Monday, December 9, 2013

Cross-platform multi-threaded foundation library

After trying to probe the future for over a year for potential show-stopper problems, a few months ago i basically decided that i had gathered enough information to effectively start coding on P2P OS with [what i think is] a pretty good chance of having covered all the major issues that might stop me dead in my tracks, and today is the day i can proudly announce the first working version of the foundation library that i'll be using for all P2P OS development: say hello to my new shiny libposif-0.9

So why another library? Well, to make the long story short, i had three main reasons for this:
  1. i wanted an independent (and free, at least LGPLed) library. Sure enough, there are quite a number of LGPLed libraries out there, but none of them is modular and minimalist enough for my taste: what i wanted was a library that has no dependencies on anything other than its host OS' kernel API (and even these dependencies must be fully POSIX-compatible) - or, if it does have some dependencies, those dependencies should be very easy to eliminate
  2. i wanted a cross-platform library. Again, there are many cross-platform libraries in the wild, but they come in bloated bundles from which it's very hard to extract only the modules one actually needs for a specific application
  3. finally, i wanted a standards-based library (except only for a minimalistic POSIX-compatible OS API, which should itself be encapsulated into a cross-platform wrapper): this means e.g. i can use C++11's std::threads but not pthreads, i can use a [platform-abstraction wrapper for] an "mkdir" command but i cannot use a Recycle Bin API, i can use HTML5 for a GUI (or text consoles for text-only UIs) but not graphical widgets or non-standard multi-media components (whether they are wrapped into a cross-platform library or not), etc
So now, with the goals out of the way, here's a brief description of what libposif is all about:
  • Messaging-based multi-threaded processing: i called this library module "libposif-mt", and what it does is it allows a program to:
    • organize processing into multiple "Tasks", where each task is a collection of one or more execution "Threads". Each task can start any number of threads, where a libposif "Thread" corresponds to an OS thread (i.e. it's a C++11 std::thread)
    • each "Thread" groups any number of "Automata", where each automaton is an independent state machine that can send messages ("SendMessage()") to other automata (may the destination automaton be part of the same thread, or in a different thread in the same task, or in another task altogether) and is notified via a callback function ("onMessageReceived()") of incoming messages sent by other automata
    And here's a picture of it all:


    To explain why i need this kind of functionality, i'll quote from an interminable doc in which i gathered all the various things that need to be implemented:
    • each router in a node pings all the other routers in its node every ~1 minute, and it updates its own node image and its live routers list
    • if a pinged router does not respond to a ping then another 2 pings are immediately retried at 10s interval, and if 5 successive ping “sequences” (i.e. 1 ping+2 retries) fail on a router, or if a router informs the ping sender that it has left the node, then said router is marked as offline in the ping sender's live router list and it will no further be pinged
    • when a router detects 5+/10 routers in its own node as offline, it sends to the server its list of offline routers with a rate limiting scheme; multiple such messages coming from different routers in a node will eventually trigger a node image update on the server
    The quote above is by no means intended to shed any light on the inner workings of an algorithm, but rather it's meant to show that the algorithm is completely asynchronous (when/then-based instead of if/then), i.e. it begs for independent inter-connected state machines that simply track some state conditions, exchange messages with one another, and change their state when a given situation occurs - and this is exactly what libposif's "Automaton" does.
    • note that the above algorithm's mechanics are very similar in nature to how the data chunks are handled in torrents, i.e. each file in a torrent is processed independently, a file is defined as consisting of blocks which are themselves processed asynchronously, etc, and depending on various [asynchronous] conditions associated with a file, or a block, etc, the torrent client takes a specific course of action
  • Portable file system interface: i called this module "libposif-fs", and it's basically implemented using POSIX functions plus several OS-specific commands which are not defined in POSIX (i didn't want to use boost, it's way too bloated a library for my taste, so i'll just wait for the file system functions to be included into std:: before using them). The big deal about this library module is that i tried to make it [reasonably] safe for program development, i.e. i tried to minimize the risk of seeing my "windows" directory vanish because i sent an empty path to an "rmdir" function and the like
    • in brief, libposif-fs declares a "Sandbox" class which has to be initialized with a "base path", and any and all file operations are methods of a Sandbox object and are confined to the base path (and its sub-directories) of the Sandbox object that they use; equally important, the base path is thoroughly tested against critical system paths and against a user-definable "pathNotAllowed()" set of rules when a Sandbox object is created, such that with a little bit of care (when initializing a Sandbox object's base path) the potential for damaging other applications' files is really slim
  • UDP sockets: this one is unsurprisingly called "libposif-udp", and i used Qt's QUdpSocket library for its cross-platform implementation (this is a good example of using a third-party cross-platform library without critically relying on it, because i'm not using Qt's event loop-based signal/slot mechanism or any other fancy stuff - just calling QUdpSocket's plain-vanilla read/write methods)
  • HTTPQueryServer: part of the "libposif-tcp" module, this component is a minimalist HTTP server intended to be run on the localhost and serve as the backend for HTML5-based GUIs (see the HTML browser-based client/server GUI model description here). In order to allow multiple simultaneous Ajax connections from the browser to multiple server sockets on the localhost, the HTTPQueryServer object implements the CORS specification
  • Miscellaneous networking functions: this "libposif-netmisc" module contains a collection of networking functions such as enumerating the local host's IP addresses (IPv4 & IPv6), performing DNS lookup and reverse DNS, etc, and it's implemented using the Qt library (it's just a wrapper over the corresponding QtNetwork functions)
  • UPnP client: "libposif-upnp": i just grabbed miniupnp for this, so not much to say about this one since it's as cross-platform a library as it can get
  • Firewall controller: part of the "libposif-fwl" module, this is a platform-dependent component implemented as a wrapper object over whatever firewall is installed on the system; for the time being i only wrote a netsh wrapper for plain vanilla windows, but it's all dead simple to extend (just add some extra files and configure the library to point to them)
    • just for the purpose of illustration, here's how the library is configured to compile for windows with [a wrapper over] windows' netsh.exe's firewall controller:
      #define libposif_fwlctrl_h\
        "libposif-fwl.src/netsh-exe/netsh_exe.h"
      #define libposif_fwlctrl_cpp\
        "libposif-fwl.src/netsh-exe/netsh_exe_win.cpp" 
      Now suppose i want to add support for a linux firewall controller to libposif: this will involve writing/grabbing a linux firewall controller, packing the source code in a sub-folder e.g. "iptables-ctrl" of the firewall controller's source modules folder "libposif-fwlctrl.src", and then, when i need to compile the library for linux (and use this firewall controller implementation), i'll just need to set the firewall #defines in the library configuration file to point to this implementation:
      #define libposif_fwlctrl_h\
        "libposif-fwl.src/iptables-ctrl/iptables_ctrl.h"
      #define libposif_fwlctrl_cpp\
        "libposif-fwl.src/iptables-ctrl/iptables_ctrl_lin.cpp"
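As an aside, to show how naturally the asynchronous algorithms quoted earlier (in the libposif-mt section) map onto small state machines, here's a sketch of the ping/retry bookkeeping for a single router - one "sequence" being 1 ping + 2 retries, with 5 consecutive failed sequences marking the router offline. The class and method names are mine for illustration, not P2P OS code:

```cpp
#include <cassert>

// Hypothetical sketch of the quoted ping policy: a router is marked offline
// after 5 consecutive failed ping sequences (1 ping + 2 retries each), or
// immediately when it announces that it has left the node.
struct PingTracker {
    int failedSequences = 0;
    bool offline = false;

    // report the outcome of one full ping sequence (true = some ping answered)
    void onSequenceResult(bool answered) {
        if (offline) return;               // offline routers are no longer pinged
        if (answered) failedSequences = 0; // any reply resets the failure count
        else if (++failedSequences >= 5) offline = true;
    }

    void onLeftNodeNotice() { offline = true; } // router said it's leaving
};
```

An Automaton per monitored router holding this kind of state, plus scheduled self-messages for the ~1-minute ping cadence, is exactly the shape of problem libposif-mt is meant for.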

So, to sum up, this is where i stand right now: i have a 20-something-page document where i gathered all the most minute details of the algorithms involved in P2P OS (network policies, client, distributed server, software protection, etc), i now have this libposif foundation library to build upon, so i guess the next big thing should be a glorious post about the first piece of code that will actually do something :)

PS
I'll most likely be expanding this library with new functionality over time, but it's probably not worth writing a new post each time i do (well, unless it's something i deem spectacular enough to warrant a separate post), so i'll just keep silently updating this post in the background as i add new modules and/or features.

PPS
In other news, i completely switched to Qt Creator, and after playing around with it for the last several months (and after quite a number of bugs and idiosyncrasies have been fixed during this time) i can now recommend it for any serious cross-platform standards-based development (there still are a few rough edges to be polished here and there, but it's already usable as it is). Here's a glimpse of my new Qt Creator 3.0 desktop in all its glory:


So good bye Borland Builder, you served me well for over 15 years, but the days of closed source software extortion are pretty much over. Nice to meet you Qt, and have a wonderful life.

Thursday, August 29, 2013

Distributed server: any DHT will do, right? Wrong.

After diving into DHTs a while ago, i first thought i had it all figured out: DHTs are the name of the game when it comes to distributed servers, or, at the very least, an appropriate and mature solution for providing a distributed routing service. And apparently that is indeed the case, but with a caveat: all the common DHT algorithms presented in the literature are highly unreliable in a high-churn [residential] end-user-supported P2P network. More specifically, what all common DHT algorithms (that i know of) lack is, on one hand, enough redundancy to cope with the kind of churn typically found in end-user P2P networks (where users frequently join and leave the network, unlike in a network of long-lived servers), and, on the other hand, sufficient resilience to face the kinds of concerted attacks that can be perpetrated in a P2P network by a set of coordinated malicious nodes.

To make the long story short, the conclusion for all this was that building the P2P OS distributed server by simply copy-pasting an existing DHT algorithm is a no-go, and this sent me right back to square one: "now what?"

Well, the breaking news story-of-the-day is that i think i found a way to strengthen DHTs just enough to make them cope with the high churn problem, and, together with the obfuscated code-based "moving target defense" mechanism, i might now have a complete solution to almost all the potential problems i can foresee at this stage (specifically, there is one more problem that i'm aware of that is still outstanding, namely protecting against DDoS attacks, but apparently there are accessible commercial solutions for this one too; i'll talk about it in another post after i do some more digging).

Without getting into too many technical details at this point (primarily because all this is still in a preliminary stage, without a single line of code being written to actually test the algorithms involved), the main ideas for an "improved DHT" are as follows:
  • use a "network supervisor" server which, based on its unique global perspective over the network, will be responsible for maintaining a deterministic network topology, all while also keeping the network's critical parameters within acceptable bounds
  • add redundancy at the network nodes level by clustering several routers inside each node: in brief, having several routers inside a node, coupled with a deterministic routing algorithm (as enabled by the deterministic topology of the network), should provide a sufficient level of resilience to malicious intruders such as to allow the network to operate properly
Sure enough, the points listed above are just the very top-level adjustments that i'm trying to make to the existing plain-vanilla DHTs, but there are quite a lot of fine points that need to be actually implemented and tested before celebrating, e.g. an iterative routing algorithm with a progress monitor at each step of the routing process, multiple paths from one node to another supported by a backtracking algorithm, node state monitoring and maintenance by the supervisor server, etc - and these are just a few examples of the issues that i am aware of.

At the end of the day, when all pieces are put together the overall picture looks something like this:


So basically this is how far i got: i have this "supervised network" architecture which i think might be a solution for a sufficiently resilient and reliable distributed server, and i have the code obfuscation-based network integrity protection, but now i need to test these thingies the best i can. I definitely won't be able to test a large-scale system anywhere near a real-life scenario until actually deploying it in the wild, but a preliminary validation of its key features taken one by one seems feasible.

PS
The network monitoring/maintenance algorithm, the node insertion/removal procedures, etc, are all pretty messy stuff that i still have to thoroughly double-check before actually diving into writing code -- e.g. here's a sneak preview of how a new node is inserted in, and announces its presence to, the routing ring:

  • the blue nodes are the "currently" existing nodes, positioned in an already-full 2^3-node ring (i.e. 000::, 001::, 010::, 011::, 100::, 101::, 110::, 111:: in the image above, where '::' means all trailing bits are 0)
  • the yellow nodes encircled in solid lines are nodes that have already been inserted in the yet-incomplete 2^4-node ring (the yellow nodes are interleaved with the existing 2^3 blue nodes in order to create the new 2^4-node ring)
  • the red node is the node that is "currently" being inserted in the routing ring (more specifically, in the yellow nodes "sub-ring" at index 0111::, i.e. in between the [already existing] blue nodes 011:: and 100::)
  • the yellow nodes encircled in dashed lines are nodes that will be inserted in the [yet-incomplete] yellow nodes ring after the "current" insertion of the red node is completed
  • after the yellow sub-ring is completely populated (i.e. there is a total of 2^4 [yellow and blue] nodes in the routing ring), the routing ring will be expanded to 2^5 nodes by inserting new nodes in between the existing [yellow and blue] nodes of the 2^4-node ring, a.s.o.; i.e. the routing ring always grows by creating a new sub-ring of "empty slots" in between the existing nodes, and incrementally populating said empty slots with new nodes
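For illustration, the interleaved ring growth described above can be expressed in a few lines of code; the helper names are mine, and addresses are shown as fixed-width bit prefixes (the post's '::' notation meaning "all trailing bits are 0"):

```cpp
#include <cassert>
#include <string>
#include <vector>

// When the 2^k ring doubles to 2^(k+1) nodes, the old nodes occupy the even
// positions (their old index shifted left one bit) and the new sub-ring of
// "empty slots" occupies the odd positions in between them.
std::vector<int> newSlotPositions(int k) {
    std::vector<int> slots;
    for (int i = 1; i < (1 << (k + 1)); i += 2)
        slots.push_back(i); // every odd position is a new slot
    return slots;
}

// render a ring position as a fixed-width bit prefix, e.g. (7, 4) -> "0111"
std::string prefix(int pos, int bits) {
    std::string s;
    for (int b = bits - 1; b >= 0; --b)
        s += ((pos >> b) & 1) ? '1' : '0';
    return s;
}
```

E.g. in the 2^4 ring the red node 0111:: is position 7, sitting between the blue nodes 011:: (position 6, i.e. 0110) and 100:: (position 8, i.e. 1000), exactly as in the figure.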

Thursday, August 1, 2013

Been stuck for several months, but now i might be on to something

As i explained in an earlier post, there are several classes of internet connection that a user may have in the real world, but for the purpose of this discussion we shall simplify the categorization in only two [top-level] "meta-classes":
  • 'good' internet connections: these connections allow a peer to have direct P2P connectivity with any other peer on the network; and
  • 'leech' internet connections: these connections only allow two peers to connect to each other by means of a relaying peer, where said relaying peer must have a 'good' connection in order to be able to act as a relay
As can be seen, any two peers with 'leech' connections will have to rely on a third-party relaying peer with a 'good' connection in order to be able to connect to each other.

In other words, there are real-world objective reasons that will prevent all peers from being equal on the network: 'leeches' will always require assistance from 'good' peers, while they will be truly unable to assist other peers on the network in any way (because of their objectively problematic internet connection)

The problem (that got me stuck for over four months):
In the real-world internet, the ratio between 'good' internet connections and 'leech' connections is (by far) sufficiently high to enable a cooperative self-sustained P2P network, i.e. there are enough 'good' peers that can provide relaying services to the 'leeches' upon request. HOWEVER, the very fact that there is a network contribution disparity between 'good' peers and 'leeches' can motivate some users to commit severe abuses that can ultimately bring down the network (if too many users become abusive): namely, a peer with 'good' connectivity might just decide it doesn't want to serve the network (by providing [bandwidth-consuming] relaying services to the unfortunate 'leeches'), and in order to get away with this unfair behavior all it has to do is misrepresent its 'good' internet connection as a 'leech' connection: once it has successfully misrepresented itself on the network as a 'leech', it will not be requested to provide [relaying] services on the network.

So the problem can now be stated as follows:
how can an open-protocol P2P network be protected against hacked malicious clients which, because the network protocol is open, can be crafted in such a way that they fully obey the network protocol syntax (and are thus indistinguishable from genuine clients based solely on their behavior), but falsely claim to have 'leech'-type internet connections that prevent them from actively contributing to the network? In brief, said malicious clients will unfairly use other peers' bandwidth when they need it, but will not provide [any] bandwidth of their own to the other peers when requested to do so, and they will get away with it by falsely claiming to sit behind a problematic type of internet connection which prevents them from being cooperative contributors to the network (when in truth they are purposefully misrepresenting their internet connection's capabilities in order to make unfair use of the network).

The standard solution (which cannot be used):
The standard solution to the problem described above is to make sure that all the peers in the network are running a digitally-signed client program, which client program is a known-good version that a central authority distributes to the peers. However, once we dive into the details of how such a solution can be implemented we get into trouble: specifically, digitally-signed clients cannot be used in the P2P OS ecosystem because this would imply the existence of an [uncompromised] signature-validation DRM running on the peers' computers, which we cannot assume; if we made such an assumption we would only shift the problem from "how do we prevent compromised peers" to "how do we prevent compromised DRMs", i.e. we'd be right back at square one

A saving idea? (go or no-go, not sure yet):
A new way of protecting a known-good system configuration is the talk of the town these days, namely the "moving target defense" (a.k.a. MTD) [class of] solutions (apparently this concept - as opposed to the underlying techniques - is so new that it hadn't even made it into wikipedia at the time i'm writing this), and for the specific case of the P2P network problem as stated above (i.e. resilience to maliciously crafted lying peers) MTD translates into the following:
  1. have a central authority that periodically changes the communication protocol's syntax, then creates a new version of the client program which complies with the new protocol, and finally broadcasts the new [known-good] version of the client program on the P2P network; in this way, the protocol change will immediately prevent ALL old clients, including the compromised ones, from logging onto the network, and will require each peer to get the new [known-good] version of the client program as distributed by the central authority (i.e. all the maliciously-crafted compromised clients are effectively eliminated from the network immediately after each protocol change)
  2. the protocol changes that are implemented in each new version of the client program will be deeply OBFUSCATED in the client program's object code (using all the code obfuscation tricks in the book), with the goal of delaying any [theoretically possible] successful reverse engineering of the new protocol beyond the release of the next protocol update, thus rendering the [potentially cracked] older protocol(s) unusable on the network
  3. the protocol obfuscator must be automatic and must itself be an open source program, where the only secret component (upon which the entire system security scheme relies on) must be the specific [random] strategy that the obfuscator elects to use as it releases each new version of obfuscated clients
As a result, after each protocol update the P2P network will only host known-good versions of clients, and by the time when any protocol reverse engineering effort might be successful, a new protocol update will already have been released, thus preventing any prior-to-the-update reverse-engineered clients to log onto the network.

The work ahead:
As can be seen from the above description, the dynamic protocol update solution relies on the ability to create and distribute obfuscated program versions at a higher rate than an attacker's ability to create a malicious reverse-engineered version of the program. Thus, given a system that uses the dynamic protocol adjustment method (as described above), the network integrity protection problem translates into the following problem:
[how] can a protocol be obfuscated such that the [theoretical] time necessary to crack the obfuscated code, given a known set of resources, exceeds a predefined limit?
Should the protocol obfuscation problem have a solution (probably underpinned by dynamic code obfuscation techniques), then the network integrity problem is solved as well (and i won't mind if it's an empirical solution, for as long as it proves viable in the real world) - so this is what i'm trying to find out now.

PS
A few articles on/related to code and protocol obfuscation:

Update
I also started a discussion on code obfuscation on comp.compilers, feel free to join here: http://groups.google.com/forum/#!topic/comp.compilers/ozGK36DRtw8

Sunday, June 2, 2013

Striving for perfection

I ran into yet another unexpected roadblock (pretty nasty stuff btw), i'm workin' on it, but ain't gonna whine about all this just yet, so let's take a break for a moment (pun intended :P) and peek at the pros for a change (it's Sunday, what the heck!)

Tuesday, March 19, 2013

A milestone year in the standardization of computing

The time for starting work on the production-quality P2P OS is getting nearer by the day, so i thought it's high time to look around a bit and try to get a clear picture of what tools and technologies are out there for building true cross-platform applications nowadays, and much to my delight here's what i found:
  1. we now have native threads built right into C++11, so there's no need for third-party libs (with God knows what kinds of portability problems each) any more
  2. ever since the days of HTML4, one could build a full-featured GUI right into a web page and have it rendered in any compliant browser (Wt is a pretty neat showcase for what can be done this way), except only for the multimedia features, which were not included in the HTML4 standard; now the W3C is making really quick progress towards baking exactly these capabilities into the HTML5 standard (via the free VP8 format), plus device I/O and persistent storage (all of these driven especially by WebRTC's needs), such that once this work is completed a standard HTML5 web page will be able to deliver a full desktop-like experience right inside a web browser (including smooth 2D graphics editing and animation via SVG and high-quality 3D effects via WebGL)
  3. last but not least, Qt is making steady progress towards becoming a usable C++ cross-platform development tool not only for all the major desktop OSes Win/Mac/Lin, but also for iOS, Android, and BB10
So at the end of the day, the world we live in looks a bit like this to me:

Back to the future: the good old tried and tested X client/server model anyone?

Now, while things look pretty neat the way they are already, how about going one step further? Namely, consider you decide to write your apps' UI based on a minimalist restriction of what HTML5 has to offer, e.g. you'll only use a limited set of widgets (say buttons, drop-down lists, and check-boxes), an [editable] text area for all your text interactions, a canvas and an image renderer for graphics, and a simple file system API (well, this is prolly "too minimalistic" a set of UI functions, but i'm picking it here just to get my message across). In this case, once your app is written based solely on such a subset of functions, what you'll have at the end of the day is a native app which requires a UI renderer with only a very limited set of capabilities: not only can you use a standard HTML5 renderer on systems where one is available, but you can also build a minimalist UI renderer of your own (in native code) on systems that do not provide an HTML renderer. More specifically, wherever you find a POSIX(-like) system it's very likely you'll be able to get a port of WebKit (or even a full-fledged browser) for a pretty decent price (if not for free altogether); but if you are about to build an entirely new system of your own, then porting the Linux kernel (or writing a new fully POSIX-compliant kernel) just in order to get WebKit running would be a titanic work that no ordinary small(ish) company can put up with all by itself.

And this is where the minimalist UI model comes into play: if you can have the GNU toolchain ported to your system (which should be pretty easy stuff, especially if you use an existing processor architecture - e.g. there's OpenRISC out there for grabs, and it comes with GNU toolchain support and all right out of the box), then all you'll have to do is implement your small(ish) UI spec (e.g. in C++) and compile it with your gcc, and there you go: your apps are truly cross-platform, ready to be deployed as-is both on any (minimalist) system that implements your UI spec and on any system that has an HTML5 browser:

The no-brainer solution: have your apps truly cross-platform-ready with a minimalist UI spec

Well, you tell me if i am wrong, but in my view (and given all of the above) this year may well become the most significant milestone in the evolution of standards-based programming since the standardization of ANSI C back in 1989.

PS
Apparently we'll still have to cope with writing BSD sockets wrappers for a while, until/if they eventually get included in the C++ stdlib, but quite frankly that's a pretty trivial piece of code (given that we have standardized multi-threading baked into C++) and not much more than a residual problem nowadays.

Thursday, March 14, 2013

Major breakthrough, project back on tracks

After over a year of crunching the IPv4 CGN traversal problem at the back of my mind, it finally clicked! Or, more appropriately called, it banged!
In fact, this click (or bang, or whatever else i should call it) is such a major breakthrough, with such immense potential implications, that i have to refrain from saying much about it on this blog before filing a provisional patent; but what i can say though is that i'm 99% confident i found a novel algorithm that can break through all the IPv4 CGN types that are out there in the wild - and this is not just in theory, i actually tested it for over two weeks on all the mobile connections that i could find, on multiple networks, in eight countries around the world (with a few more tests pending at the time of writing). It did take me three weeks to refine the algorithm down to its current details (and it was quite a bumpy ride), but the bottom line is that i now have a working solution for full P2P/IPv4 connectivity, down to the most minute details, so that i no longer have to wait for God knows how many more months (or years?) to see how the IPv6 dust will eventually settle; that is, i can start working on the production-quality implementation of P2P OS right now.

So the next step: full-throttle fund raising campaign for developing the release-version of P2P OS (i.e. with youtube promo, kickstarter project, call-a-friend, and whatever else it will take to get them wheels turning). The stars aligned, the time has come, let the fun begin!

PS
About the patent thingie, that's just for playing safe, be cool :) - this project will be open source after all ("networking for the masses", remember?).

Wednesday, February 20, 2013

WebRTC starts flexing its muscles

During the past two years since i started working on P2P OS there has been some significant progress on the WebRTC project, which two years ago looked more like a statement of intent than anything else; and since WebRTC is backed by big players like Google, Mozilla, and Opera (with Microsoft notably missing from this lineup after it flushed $8 billion down the drain last year on Skype, and now pitifully crying foul and trying to screw things up again), it might eventually turn into a viable P2P solution that could make P2P OS rather redundant. However, while a side-by-side comparison between P2P OS and WebRTC does have some merit, i think the question of which of the two will come out on top (in terms of quintessential project goals) is far from settled at this point, mainly because WebRTC pays little to no attention to properly dealing with the very real (and critical) problems that the internet topology of today and tomorrow poses to P2P connectivity. In a nutshell, WebRTC opted for a conservative SIP-like technology wherein it falls back to a network of dedicated TURN relay servers whenever a direct P2P connection between two nodes cannot be made, with little consideration for the fact that such a relay network requires some big-pockets "sponsors" who can throw in enough ca$h to keep it up and running (e.g. Viber pumps some $2.5 million a year into keeping its rather small-ish network alive), and i think it's very likely that users will be forced to have/get a Google/Mozilla/Opera/M$/whatever account in order to use the service.

Alternatively, P2P OS aims at creating a self-reliant P2P network which is meant by design to gracefully navigate both the rough waters of the current IPv4 exhaustion and IPv4-to-IPv6 transition, and the promised shiny days of a P2P-friendly IPv6-only world of the decades ahead. Also, the scope of P2P OS is not restricted to point-to-point communication between nodes; instead, its design goal is to provide a generic foundation for content-centric networking (a.k.a. named data) where point-to-point communication is only one of many use case scenarios.

To sum up: while i can see WebRTC as a serious potential competitor to P2P OS, i think abandoning P2P OS because of WebRTC would be premature at this point in time.

Sunday, December 23, 2012

P2P networking during the IPv6 transition and beyond

After getting some fresh air a couple of weeks ago, i started to think about what kinds of internet connections are to be expected during the IPv4-to-IPv6 transition period (based on this year's developments and apparent trends in IPv6 and IPv4 CGN deployments), about how long this transition period might be, and about if/how this transition period can be accommodated by a pure P2P network topology. This post is about what i came up with.

First, the time frame: i divided the transition period into a first stage, 2012 -> 2020, during which the adoption of IPv6 will increase to some 40% market share, while in parallel most IPv6 deployments will be accompanied either by a legacy routable IPv4 address or by a new private IPv4 address delivered by the ISP via CGN (note: it is possible, even likely, that mobile IPv6 internet connectivity will lag the cable IPv6 deployments, such that only private IPv4 CGN-based connectivity will be provided on mobile networks for quite a few years to come); then a second stage, which i guesstimated to span the 2020 -> 2030 time frame, during which IPv6 will incrementally become the norm, while routable IPv4 will almost disappear and private CGN-based IPv4 addresses might also be phased out.

The connection types for the two-stage transition scenario mentioned above are illustrated below, with guesstimated percentages (market share) for the three time points 2012, 2020, and 2030 included in parentheses next to each connection type:

  • the top half of the diagram above "IPv4 host" illustrates the types of connections that will be available for IPv4-only hosts during the transition period
  • the bottom half of the diagram "IPv4+IPv6 dual stack host" illustrates the types of connection that will be available for dual-stack hosts (i.e. for the hosts that will be connected to the internet via both an IPv4-type connection and an IPv6-type connection)
  • the total of ~15% IPv4 CGN-based connections listed for 2012 in the diagram above mostly account for 3G/4G mobile connections (these connections almost never provide a Public IPv4 to the end user)

In order to analyze how two given hosts can establish a P2P connection given the transition scenario described in the diagram above, we will consider all the possible variants for a pair of hosts with one host being the connection initiator and the other host being the responder. Based on this convention, the diagrams that follow in this post will specify how a P2P connection can be made depending on the type of connection that each of the two hosts has to the internet.

To make things easier to understand, let us start with an example diagram that illustrates how an IPv4-only host with a public IPv4 address initiates a connection to another IPv4-only host which sits behind an Endpoint-Independent Mapping NAT device: specifically, the diagram below shows that a direct link can be established between the initiator and the responder (this link is illustrated in the diagram by the blue arrow), i.e. in this specific example the P2P connection does not need to pass through a relaying peer, and thus no assistance is needed from other peers for establishing the connection:


And now let us look at all the possible P2P connection variants, grouped into "connection classes" diagrams.

For starters, the next diagram shows the types of internet connections that the initiator and responder must have in order for them to be able to establish a direct IPv4 P2P connection (i.e. without requiring a relaying peer):

  • for example, the last connection type in the diagram above (counting on the initiator side) links an initiator peer with a Large Port Set CGN-based [IPv4] connection to a responding peer that has a Public IPv4 address (or UPnP)

A second class of IPv4 connections is illustrated in the diagram below, where the types of [IPv4] connections that the initiator and responder have to the internet require the assistance of a third peer for relaying their communications, with said third-party relaying peer being connected to the internet via an Endpoint-Independent Mapping [IPv4] NAT router.
  • note: the EIM NAT44 relaying peer in the diagram below could also be a Public IPv4 (or UPnP) peer, but Public IPv4 connections are scarce and becoming scarcer by the day (while EIM NAT44 connections are plentiful), such that the Public IPv4 peers have been reserved (in this analysis) for relaying other kinds of IPv4 P2P connections that cannot use EIM NAT44-based relays (the P2P connections which strictly require a Public IPv4 relay are discussed in the next paragraph)
  • for example, the last connection type in the diagram above (counting on the initiator side) links an initiator peer with a Small Port Set CGN-based [IPv4] connection to a responding peer that sits behind an Endpoint-Dependent Mapping [IPv4] NAT with incremental port allocation

The next class of IPv4 connections shown in the diagram below also requires a third peer for relaying the communication between the initiating and responding peers, but in these cases said third-party relaying peer has to have a Public IPv4 (or UPnP) connection to the internet:

  • for example, the first connection type in the diagram above (counting on the initiator side) links an initiator peer sitting behind an Endpoint-Independent [IPv4] NAT with a responding peer that sits behind an Endpoint-Dependent Mapping [IPv4] NAT with random port allocation

Let us now look at the situations where one peer is IPv4-only and the other peer is dual-stack: in this case, a P2P connection between said types of peers will require an IPv4+IPv6 dual-stack relaying peer, where said relaying peer will not only relay the data between the initiator and the responder, but it will also perform a protocol translation:

  • the red dashed lines: if one (or both) of the peers illustrated in the diagram above as IPv4+IPv6 dual-stack does not actually have IPv6 connectivity, then connecting such two peers will require the assistance of a Public IP (or UPnP) relaying peer (represented by the red dashed line)

The last class of possible connections is illustrated in the diagram below, where both peers are dual stack: in this case a direct P2P link can be established over IPv6 between the two peers, without the assistance of a relaying peer:

  • the dashed lines: if one (or both) of the peers illustrated in the diagram above as IPv4+IPv6 dual-stack does not actually have IPv6 connectivity, then, depending on the kinds of IPv4 connections the two peers have to the internet, the connection between such two peers can be established either as a direct IPv4 P2P connection (this is the case of the blue dashed line above, when both peers are connected via an Endpoint-Independent CGN), or the connection will require the assistance of a Public IP (or UPnP) relaying peer (these situations are represented by the red dashed lines) or an Endpoint-Independent NAT relaying peer (these situations are represented by the green dashed lines)
  • as is obvious from the diagram above, once the transition to IPv6 is completed, all P2P connections will be feasible as direct IPv6 connections, i.e. they will not require a third-party peer for relaying the P2P communication.

Finally, a couple of important remarks on the dashed-lines connections illustrated in the diagrams above:
  1. the dashed lines connections are to be used only in those situations when the solid line connections cannot be used; for example, a quite plausible situation when the dashed lines connections might have to be used is in the case of mobile 3G/4G connections which may only provide IPv4 CGN-based connectivity for a few more years to come (before finally adding IPv6 support)
  2. the biggest vulnerability of the P2P topology presented in this post are the *red* dashed-line connections in the last two diagrams, because they use a Public IPv4 (or UPnP) relaying peer, and such peers are an already scarce resource in today's internet topology, only to become scarcer during the IPv4-to-IPv6 transition. Specifically, one can think of a pessimistic scenario where 3G/4G mobile connections reach some 70% market share (sometime during 2012 -> 2020) while only implementing IPv4 CGN-based connectivity (i.e. without IPv6), by which time the availability of Public IPv4 peers may well be reduced to some 5% (or less) market share: in this case, the red dashed-line connections in the two diagrams above may well account for some 50% of all possible P2P connection routes, and with only 5% relays available in the network there will be about one Public IPv4 relaying peer online for every 10+ live P2P connections requiring such a relay. This will put such a heavy strain on the Public IPv4 relays that they might have to throttle their relaying speed and only allow low-traffic relaying during periods of network congestion (e.g. they will probably still be able to relay text messaging, slow-rate file transfers, and maybe voice, but they will most likely have to degrade/deny their relaying service for video streaming, desktop sharing, and other high-traffic services)

In conclusion, now that i'm done with this deluge of diagrams and geeky explanations, here's my 2 cents in a nutshell: should the IPv6 and CGN deployments stay the course (and assuming i'm not missing some show-stopper somethin' somethin'), the point to take home from this post is that the tide seems to have turned from exactly one year ago: P2P networking now looks feasible both during and after the IPv4-to-IPv6 transition period, with the caveat that some serious challenges remain, mostly related to how IPv6 and CGNs will be deployed on the 3G/4G mobile networks - and i don't yet have sufficient data to guesstimate how serious said challenges are.

In other news:
Qt5 is out, with my stalking here: https://bugreports.qt-project.org/browse/QTBUG-28461

Wednesday, December 5, 2012

IPv6 picking up steam, and it's apparently done right

I've been touring the net for the past week to check out the current state of affairs with IPv6 implementations and deployments to end users, and what i found is very encouraging, at least for now: namely, all the major ISPs i've checked on the net (Free.fr, Telefonica, Deutsche Telekom, Swisscom, Comcast, AT&T, TWC, DoCoMo, and even my home country's RCS-RDS network) seem to provide end users with at least a proper, P2P-friendly /64 address block (instead of a single /128 address, leaving the users to cope with it by using all sorts of non-standardized NAT66 boxes).

Since the two main reasons for putting this project on hold one year ago were the unknowns related to IPv6 deployment and the uncertainties plaguing the future of Qt, and since significant progress is apparently being made in the right direction on both of these issues, at this point in time i can see at least some reasons for renewed hope on this project's future.

Some of the IPv6 deployment links i've checked:
...and some live stats by ISPs here: http://www.worldipv6launch.org/measurements/
So what i'll probably do next is write a small IPv6 tester application to effectively check the IPv6 deployments as they are in the real world; once i do, i'll post the results.

PS
I haven't been able to find any reliable stats on 3G/4G mobile IPv6 deployment trends just yet, but i'll keep looking. And the same holds for mobile CGN-based IPv4 connections: this also needs some further investigation.

Thursday, September 20, 2012

Qt resurrected, support planned for Android and iOS

This might be really big: http://qt.digia.com/About-us/News/Digias-Completion-Of-Acquisition-Paves-Way-For-Qt-To-Become-Worlds-Leading-Cross-Platform-Development-Framework/
Helsinki, Finland and Santa Clara, US - September 18 2012, Digia (NASDAQ OMX Helsinki: DIG1V) today announced that it has completed the acquisition of the Qt software technologies and Qt business.
Qt already runs on leading desktop, embedded and real-time operating systems, as well as on a number of mobile platforms. Digia has initiated projects to deliver full support for Android and iOS mobile and tablet operating systems within Qt and will present the product roadmap and strategy later in the fall.
All significant existing online Qt information has been transferred to qt.digia.com, which, along with qt-project.org, is now the main information-sharing site for Qt.
Now add to the above the cut-them-chains-and-break-free move of Qt5 to QPA (a nice tour at Dr. Dobb's here), and Qt might just have set itself on course for becoming a reliable and future-proof platform. But then again, there's still a lot of work to do to make Qt truly platform-agnostic - namely, QPA must become a full platform abstraction layer instead of focusing only on the display server, and there's no clearly stated plan in this direction on Qt's website, so only time will tell...

PS
I can't help wondering if Digia's management really understands what QPA's potential implications are for their business (especially if it is to grow into a complete abstraction layer, multimedia included and all), so until i see QPA fully integrated (and properly documented) in the LGPL-ed version of Qt5 i'll just call myself "cautiously optimistic" on this rather extreme openness move...

Thursday, June 21, 2012

IPv6 launched: "this time it's for real"

Under the slogan "this time it is for real", IPv6 was officially launched on 06.06.2012. What will follow in the next couple of years, namely the way in which the world's major ISPs will deploy IPv6 on terrestrial and mobile networks to the end customer, will determine the future fate of the internet: to P2P or not to P2P.

At this point in time there are some encouraging signs, in that ISPs engaged in the IPv6 transition seem to follow the new standard's guideline of providing /64 (or better) prefixes to landline users and (at least) a proper /128 address to mobile devices, but nothing can be considered cast in stone at this early stage of deployment; only time will tell.

One very encouraging thing is that the IETF seems to manage to control the open-standards game, and the "IPv6 ready" logo is an extraordinarily powerful tool for steering CPE manufacturers to play the game properly (i.e. plain simple prefix delegation, without one-to-many NAT66 improvisations on their devices).

Maybe there still is hope.

Monday, December 12, 2011

Wrapping things up with a call for partners

After almost a year of hard work, i eventually realized that the P2P OS project is facing some very serious threats which i just don't feel capable of negotiating all by myself, so i decided to wrap things up with this "Call for partners" post. In a nutshell, here's a super-condensed list of the key things that might be of interest for a potential partner to this project:
  • what i managed to do so far is build a proof-of-concept skype alternative; in fact, the P2P OS networking algorithm has been designed to allow for a much more robust P2P network than skype's, in the sense that while skype requires a large number of computers with direct internet connections (i.e. public IP or single-router UPnP) in order to support its network (about one such computer per ~100 registered users), P2P OS has no such restriction (i.e. the P2P OS network is self-supporting even if all the computers in the network are running behind routers and/or firewalls).
  • i believe this project can become a skype killer because of some key differentiators: it's open-source, it's not collecting any user data, and it consequently can be deployed (and replace skype) in corporate environments
    • important: this project is intended to be released under some form of open source license primarily for allowing code inspection and for making it free for home users; however, the program can be released under a dual license, with a commercial license for business users
  • i believe money can be made from this project if it is finalized, e.g. by monetizing it as a product (this can be particularly interesting for companies wishing to host their own private network), building services around it (there's a rather large set of options here, including per-user subscriptions for guaranteed uptime), third-party app advertising on the P2P OS home page (sort of a marketplace, but only hosting the apps' ads, i.e. without actually hosting the apps themselves), advertised sponsorships, etc, and of course, as a last resort, selling the project copyrights (potential buyers include open source-based and open source-friendly companies).
However, there are at least two major issues with this project and, as far as i can see, with any P2P-based application that someone would want to develop these days, and these two issues are also the sole reason for which i decided to put this project on hold:
  • catch 1: the current version of P2P OS has been designed to circumvent the P2P connectivity issues specific to IPv4 in a NAT-based home network environment, but it does not implement provisions for supporting carrier-grade NAT (CGN), and, more importantly, porting the project to IPv6 might not be feasible at all. More specifically, whether any P2P application will be possible on the next-generation IPv6 networks depends on how IPv6 residential gateway devices (RG) will be implemented, and i can't find any way to guesstimate, let alone negotiate, this threat because of the very large number of factors involved (e.g. whether high-profile internet companies such as MS, google, or yahoo will lobby for P2P support or not, whether the large ISPs will consider P2P connectivity important to end users or not, whether the present-day high-profile P2P applications such as skype or yahoo will keep their current P2P model or transition to a relays-only model - said applications already use relays whenever one user is on a 3G network, etc)
  • catch 2: when i started this project i assumed an OS landscape that would allow easy porting and deployment via Qt on all three major desktop OSes (Windows, OS X, Linux) plus on at least two major mobile OSes (Android and Nokia's MeeGo). However, things have changed dramatically in the sense that (1) porting to the new Windows 8 mobile will be a tough nut to crack (to say the least) because of the limited WinRT API it offers by sandboxing the applications, (2) Nokia abandoned its efforts with MeeGo (so no Nokia port), and (3) Qt has been spun off (read: all-but-abandoned) by Nokia, such that the future of an Android port is uncertain at best; additionally, my paranoid view on the course of MS' and Apple's market strategies suggests that there is a real danger of Windows Phone/Tablet OS and iOS ports being prohibited by these two OSes. To sum up, Qt will probably still be a usable tool for porting P2P OS to the future versions of Windows Desktop, OS X, and Linux for the next 5 years or so (i.e. porting to these OSes should be fairly easy), but the mobile and tablet ports might prove to be mission impossible.
Keeping the above in mind, should anybody be interested in learning more details about this project (and its associated risks, should it be continued), please drop me a line.

Update
As i gather more info on the current trends in CGN deployments (reading articles e.g. from CISCO and Juniper on this topic and talking to industry insiders), adding full CGN support to P2P OS (or to any other P2P application for that matter) increasingly seems technically impossible because of the algorithms that ISPs are enabling on their CGNs (namely, most ISPs seem to opt for what is called EDM - endpoint-dependent mapping - which maximizes the reuse of the CGN ports but breaks P2P), such that an IPv4-only P2P OS implementation will probably be impossible for the coming years, when CGNs will be the IPv4 connectivity norm for the end user.

Sunday, November 20, 2011

A glimmer of hope (but not much more)

Don't want to let this blog end (or just hibernate, time will tell) on that all's doom and gloom note of the previous post, so i'll add this: there still is a slim chance for the internet to go back to its roots, i.e. with everybody having a direct IP connection, and this slim chance comes from IPv6. However, the reason i call IPv6 just a slim chance is that although IPv6 can bring back "the internet of peers" to the world, the ISPs (and, to a lesser extent, the various router vendors) can still f* this up e.g. by deciding to install default firewalls inside their residential gateway devices, in which case your average Joe will still not be able to set up his router to accept incoming connections. And no incoming connections means no P2P. And no P2P means no internet.

As i said in the previous post, at this point in time i'm incapable of assessing whether the internet is dying, or it's headed towards a rebirth (via IPv6), or whatever else, so for the time being i'll just wait. The only thing that i can think of possibly doing at this point is to somehow start an awareness campaign for including "you need to have an unrestricted public IP" in the very definition of "internet connection"; any other type of connection should simply not be called "internet connection". Maybe i'll make a clip on youtube about this if/when i'll find the time and energy to do it...

Update
Apparently i'm not alone in my paranoia: www.isoc.org/tools/blogs/scenarios

Friday, October 14, 2011

Signing off, very likely permanently

There is some good news, some bad news, and a killer conclusion (or maybe just an artificially-induced coma - this remains to be seen).

The good news:
There seems to be a way of shouldering one billion users on a P2P network with just some $1,000/mo central server traffic: it's called Kademlia (seminal article, wikipedia, search). Kademlia essentially creates an overlay network of micro-servers which, in turn, sustain the rest of the network users, while a central server is only used to log on to the network. But there's a catch: in order to distribute micro-server services over a P2P network there have to be at least ~1% of the participating peers that can act as micro-servers, i.e. they must:
  1. be directly connected to the internet (i.e. they are not behind a NAT or firewall)
  2. have relatively stable connections, i.e. they should stay connected tens of minutes after they completed a network operation (e.g. a chat session, a file transfer, etc)
Both of the above conditions are easily met in today's internet topology: (1) is achieved by any peer that sits behind at most one router, provided the router is UPnP-enabled, and (2) is achievable by having the P2P application keep running in the background for a while after a specific p2p session ends, or by making the sessions themselves last a relatively long time. BTW, the most successful p2p applications currently available - Skype and BitTorrent - both do their best to enforce the above two conditions on all the nodes on which they are installed (by remaining online even after you click their 'close' button, and by turning on UPnP by default).

The bad news:
In brief, the bad news is that the good news doesn't do me much good. Because:
  • a) i no longer trust that (1) will be maintained in the future. People increasingly install cascaded routers in their homes, and UPnP gives no sign whatsoever that it is willing to address this issue (i.e. make a computer directly accessible on the internet when it is connected through cascaded routers). Furthermore, ISPs can deploy new methods of preventing p2p applications from running at any time (and i don't mean filters, but rather generic methods such as CGNs and firewalls - and i really don't think this is far-fetched)
  • b) the new mobile communications paradigm will place an increasingly significant burden on a depleting pool of directly-connected peers (because mobile data plans are always offered through operator NATs - but, even if they weren't, a mobile peer can't be used as a server because of mobile traffic costs). And BTW, i have this feeling that one of the main reasons for which skype was sold was the realization that the pool of "supernodes" (skype's terminology for shamelessly using people's bandwidth and making $8 billion out of this scheme) is depleting, and some big-pockets company will eventually need to step in and shoulder the network with dedicated servers of their own (sure, that's just a hunch, but it will be interesting to see if/when that "use UPnP" checkbox goes away from skype's connection settings, cause that'd pretty much say that skype fully transitioned to in-house supernodes)

    Update
    It happened: arstechnica.com/business/news/2012/05/skype-replaces-p2p-supernodes-with-linux-boxes-hosted-by-microsoft.ars (and never mind the M$ lady's reply, it's damage control nonsense)

My killer conclusion:
At this point in time i'm incapable of evaluating just how serious the above challenges (a) & (b) really are, let alone other unforeseen issues that might creep in; and since implementing Kademlia would require anywhere from 6 months to one year of hard work, i'm just not ready to plunge into such an effort alone and empty-handed.

So i stop. Very likely for good.

    Saturday, October 8, 2011

    Roadblock

    Mea culpa for not updating this blog in quite a while, but it's not because i slowed down work on the project; rather, much worse, i've hit a pretty darn serious roadblock. In brief, when trying to write the algorithm that would enable a p2pOS client to act as a relay for its plugins (i.e. to "connect the crossed red lines" that i talked about in a previous post), i also had to try to define the low-level API that the plugins will use to connect to one another (by means of their associated p2pOS clients acting as relays, one relay at each end of a P2P connection). And as one thing led to another, i eventually reached the conclusion that what i need to do first is establish exactly how a peer joins the network and connects to another peer. And this is where things got really, really ugly.

    Without entering into too many technical details, the important point here is that in order to have two peers connect to each other, they have to go through an initial "handshaking phase" during which the two peers learn some essential things about one another (e.g. their IP addresses, what kind of router(s)/firewall(s) they are behind, etc), and this handshaking phase has to be negotiated through a dedicated handshaking server. Well, i can try to hide behind all sorts of technical arguments, but the fact of the matter is that ever since i started this project i never tried to calculate exactly how much traffic such a central handshaking server would require for a large P2P network (i'm talking about 100,000,000...1,000,000,000 users being online), only to find out now that the numbers are astronomical: namely, we're talking about thousands, or even tens of thousands, of terabytes/month, which translates into a handshaking server operating cost somewhere in the hundreds of thousands, maybe millions, of dollars a month. This in turn means that hosting such a server is not something that just about any punk can do in his basement, which in turn means a large company would be required to finance the network operation. Or, in layman's terms: the network can never be truly open, no matter what license will be covering this project, no matter what verbal commitments a company would make, etc. And since the (maybe only) non-negotiable objective of this project is to create a truly open P2P platform, well... you guessed it: i'm stuck.

    But, as i said at the beginning of this post, all this mess doesn't mean i gave up on the project; in fact, because it seems increasingly likely that some sort of distributed handshaking algorithm will be necessary, i made quite a few tweaks in the program in order to reduce the traffic between connected peers (e.g. i managed to reduce the P2P keep-alive traffic by about an order of magnitude by detecting the peers' routers' port timeouts and only sending keep-alive messages at the required rate), i refined the router classes such that over 90% of the router models can now act as relays, and i introduced an algorithm that detects if a peer is directly connected to the internet (i.e. public IP or UPnP) such that it can serve as a handshaking server in the network. This is what the new "Settings" panel looks like now:


    So what i'm doing right now is study what other smart-@$$ P2P projects have done (e.g. Gnutella, Freenet, etc), i'm trying to learn about various DHT approaches (there's a very nice tutorial talking about the basics here), etc, and i'll see if i'll be able to come up with a solution. Keep ya fingers crossed for me, it's in the world's best interest and sh*t :)

    PS
    Here's how a p2pOS-based P2P session is established through NAT routers with the help of a handshaking server: the blue messages are P2P messages, while the green messages are relayed via the handshaking server (once the handshaking phase is completed, all messages from one peer go directly to the other, i.e. they are P2P messages):