Finish SSH apps post

This commit is contained in:
Bad Manners 2024-09-23 13:16:43 -03:00
parent 402698e9ca
commit 4043a2dad9
132 changed files with 168 additions and 134 deletions


---
slug: supercharged-ssh-apps-on-sish
title: Supercharged SSH applications on sish
pubDate: 2024-09-23
isAgeRestricted: false
authors: bad-manners
thumbnail: /src/assets/thumbnails/blog/supercharged_ssh_apps_on_sish.png
description: |
  After discovering the joys of a reverse port forwarding-based architecture, I dig even deeper into SSH itself to make the most of it.
prev: ssh-all-the-way-down
import imageRusshAxum from "@assets/images/ssh/russh_axum.png";
import imageCheckboxes from "@assets/images/ssh/checkboxes.png";
import imageMultipaintByNumbers from "@assets/images/ssh/multipaint_by_numbers.png";
This is my second post investigating SSH, and learning what it has to offer.
<TocMdx headings={getHeadings()} />
<figure>
  <Image
    src={imageSishPublic}
alt="Diagram entitled 'sish public', showing that Eric's machine with a service exposed on localhost port 3000 connects to sish via the command (ssh -R eric:80:localhost:3000 tuns.sh). This creates a bi-directional tunnel and exposes https://eric.tuns.sh to the Internet, which Tony accesses from a separate device."
/>
<figcaption>
With a simple reverse shell command, and a configured sish instance, we can expose anything to the Internet. ©
Antonio Mika
</figcaption>
</figure>
In fact, we can host anything TCP-based as long as we can create a secure shell to sish, even if it's running on the same host machine.
In a way, this is a two-fold solution. sish provides us with:
1. A [reverse proxy](https://en.wikipedia.org/wiki/Reverse_proxy), which will handle and route any incoming traffic to the correct applications.
2. A [hole-punching technique](<https://en.wikipedia.org/wiki/Hole_punching_(networking)>), letting us overcome any limitations that NAT imposes.
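To get a feel for the reverse-proxy half, here's a toy sketch of the routing decision: map an incoming hostname back to the tunnel that claimed it. (Purely illustrative; this is not sish's actual routing code, and `claimed_subdomain` is a made-up helper.)

```rust
/// Toy model of a sish-like proxy's routing decision: given the Host
/// header of an incoming request and the proxy's base domain, recover
/// which tunnel name claimed it ("eric.tuns.sh" -> "eric").
fn claimed_subdomain(host: &str, base_domain: &str) -> Option<String> {
    // The host must end with ".{base_domain}" to belong to the proxy.
    let prefix = host.strip_suffix(base_domain)?;
    let subdomain = prefix.strip_suffix('.')?;
    if subdomain.is_empty() {
        None
    } else {
        Some(subdomain.to_string())
    }
}

fn main() {
    // The tunnel requested with `ssh -R eric:80:localhost:3000 tuns.sh`
    // would be matched like this:
    println!("{:?}", claimed_subdomain("eric.tuns.sh", "tuns.sh")); // prints Some("eric")
}
```

Everything that doesn't resolve to a claimed tunnel simply gets rejected, which is why a single sish instance can front many unrelated services.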
But as of now, all of our applications look something like this (in [Docker Compose](https://docs.docker.com/compose/) configuration):
<figure>
```yaml
services:
  # ... snip ...
```
<figcaption>A basic server connecting to sish via a persistent reverse SSH tunnel.</figcaption>
</figure>
We have two images running for our application. One (`server`) is the actual webserver, while the other (`autosish`) handles the reverse port forwarding for us.
It makes sense to have this separation into two images if our application only uses an HTTP socket and isn't aware of the SSH tunneling shenanigans going on... which is the case for most applications.
But if we build our _own_ application, we could interact directly with SSH instead. In that case, how deep can we really integrate it with the existing architecture...?
In this post, we will work on a tunnel-aware application, and find out more about the SSH protocol along the way. Be forewarned that there will be plenty of [Rust](https://www.rust-lang.org/) code ahead!
But first of all, how does remote port forwarding through an SSH tunnel work?
Better yet, how does _an SSH tunnel_ even work?
In the previous post, I explained that SSH is an [application-layer protocol](https://en.wikipedia.org/wiki/Application_layer) that runs on top of TCP. Our SSH server is listening on a port (usually 22) and we are able to connect to it.
SSH has its own set of protocols and expected behaviors, but so long as a library implements them properly, we should be able to connect without the default SSH client.
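In fact, the handshake starts in plain text: per [RFC 4253](https://datatracker.ietf.org/doc/html/rfc4253), the first line each side sends over the TCP socket is an identification string like `SSH-2.0-OpenSSH_9.6`. As a quick illustration (toy code, not part of this post's client), splitting that banner apart takes just a few lines:

```rust
/// Split an SSH identification line (RFC 4253, section 4.2) into its
/// protocol version and software version, e.g. "SSH-2.0-OpenSSH_9.6".
/// Returns None if the banner doesn't follow the expected format.
fn parse_ssh_banner(banner: &str) -> Option<(String, String)> {
    // The banner may carry an optional comment after a space; drop it.
    let ident = banner.trim_end().split(' ').next()?;
    let rest = ident.strip_prefix("SSH-")?;
    let (proto, software) = rest.split_once('-')?;
    Some((proto.to_string(), software.to_string()))
}

fn main() {
    let (proto, software) = parse_ssh_banner("SSH-2.0-OpenSSH_9.6\r\n").unwrap();
    println!("protocol {proto}, software {software}"); // prints "protocol 2.0, software OpenSSH_9.6"
}
```

Everything after this exchange is binary and (soon) encrypted, which is exactly what a library like russh handles for us.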
```rust
// ... snip ...

async fn main() -> Result<()> {
    // ... snip ...
}
```
If you're unfamiliar with Rust, this might be a lot at once. In summary, we're doing two things: at the top, we import any dependencies we need; and at the bottom, inside of `async fn main()`, we're setting up a client connection with `client::connect()`.
Aside from this code, we also need to define the `Client` struct, which will be responsible for answering messages created by the server. This will implement the `russh::client::Handler` async trait, responsible for exposing our user-defined methods to the ones that the library knows to call.
```rust
use async_trait::async_trait;
// ... snip ...
```

With these two pieces, we can try running our program with cargo like this:

```sh
cargo run -- -R test -p 80 -i ~/.ssh/id_ed25519 sish.top
```
We're using [clap](https://docs.rs/clap/latest/clap/), an argument parser, to read the flags that are passed. The way it's set up, you can read this as being equivalent to the other command that we've seen so far:
```sh
ssh -R test:80:localhost:???? -i ~/.ssh/id_ed25519 sish.top
```
(Notice that we no longer specify the local port to forward remote connections to; we'll get to that later.)
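For reference, the `-R` value that the stock OpenSSH client takes is just a colon-separated spec. Here's a toy parser (purely illustrative, not part of our clap setup) showing the fields it packs together, including the `localhost:port` pair that our own client drops:

```rust
/// Decompose an OpenSSH-style remote forwarding spec, e.g.
/// "test:80:localhost:3000" -> (bind address, bind port, local host, local port).
/// The bind address is optional in the real client, so this toy version
/// handles both the 3- and 4-field forms.
fn parse_remote_forward(spec: &str) -> Option<(Option<String>, u16, String, u16)> {
    let parts: Vec<&str> = spec.split(':').collect();
    match parts.as_slice() {
        [bind, port, host, hostport] => Some((
            Some(bind.to_string()),
            port.parse().ok()?,
            host.to_string(),
            hostport.parse().ok()?,
        )),
        [port, host, hostport] => Some((
            None,
            port.parse().ok()?,
            host.to_string(),
            hostport.parse().ok()?,
        )),
        _ => None,
    }
}

fn main() {
    // What `-R test:80:localhost:3000` decomposes into:
    println!("{:?}", parse_remote_forward("test:80:localhost:3000"));
}
```

Since our program will answer connections itself instead of relaying them to a local socket, only the first two fields matter to us.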
When we run this, it prints `Connected.` on our terminal, then immediately exits.
At least we're doing...not nothing.
You might've realized that the connection is being ignored right after we create it. The `client::connect()` function simply establishes the TCP socket and checks for the server's key fingerprint, through the single method that we've implemented on the `Client` struct.
As you may have guessed from the identity file being passed to the command, we'll have to use that to authenticate now. So after we create a connection, we'll immediately call `session.authenticate_publickey()` to do so via public key cryptography. I've cut the repeated portion of the program below with a `snip`:
```rust
use std::path::PathBuf;
use std::sync::Arc;

use anyhow::Context;
use russh_keys::decode_secret_key;
use tokio::fs;

// ... snip ...

async fn main() -> Result<()> {
    // ... snip ...
    let secret_key = fs::read_to_string(args.identity_file)
        .await
        .with_context(|| "Failed to open identity file")?;
    let secret_key = decode_secret_key(&secret_key, None)
        .with_context(|| "Invalid secret key")?;
    if session
        .authenticate_publickey(args.login_name, Arc::new(secret_key))
        .await
        .with_context(|| "Error while authenticating with public key.")?
    {
        println!("Public key authentication succeeded!");
    }
    // ... snip ...
}
```
If your key is authorized with the given server, this will print `Public key authentication succeeded!` after connecting, then immediately exit again. Not a lot of progress, but bear with me.
So connecting and authenticating is straightforward enough. You might draw a parallel with first connecting to a website, then logging in with your credentials. What comes after you log in is a bit more freeform, and depends on what you intend to do on the website.
With SSH, the analogy still holds. After creating a session, there are many options for what we can do next ([all of them available in russh](https://docs.rs/russh/0.45.0/russh/client/struct.Session.html)):
- `request_pty()`: Request that an interactive [pseudoterminal](https://en.wikipedia.org/wiki/Pseudoterminal) is created by the server, allowing us to enter commands over a remote shell session. This is the most common use case for SSH.
- `request_x11()`: Request that an [X11 connection](https://en.wikipedia.org/wiki/X_Window_System) is displayed over the Internet. This lets us access graphical applications through SSH!
```rust
async fn main() -> Result<()> {
    // ... snip ...
}
```
Once again, it succeeds and binds on the provided host and port, then exits immediately. I'm seeing a pattern here...
It turns out that there's one missing piece here, and that is to create an open-session channel. We'll get into the reason why in the next subsection, but for now, let's push on with some more code.
```rust
use russh::{ChannelMsg, Disconnect};
// ... snip ...

async fn main() -> Result<()> {
    // ... snip ...
    let mut channel = session
        .channel_open_session()
        .await
        .with_context(|| "channel_open_session error.")?;
    println!("Created open session channel.");
    let mut stdout = stdout();
    let mut stderr = stderr();
    let code = loop {
        // ... snip ...
    };
    // ... snip ...
}
```
There's a lot going on in this one. First we create a channel with `session.channel_open_session()`, then we set up a `loop` that will read every message that we eventually get through this channel (by reading it with `channel.wait().await`). We have to handle each message type appropriately. Then, once the channel closes, we try to close the session cleanly if possible.
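To make the shape of that loop clearer, here's a toy model of it, with a hand-rolled enum standing in for russh's `ChannelMsg` (the variant names mirror the real ones, but none of this is russh code):

```rust
/// Hand-rolled stand-in for the subset of channel messages our loop
/// cares about, mirroring russh's ChannelMsg variants.
enum Msg {
    Data(Vec<u8>),
    ExitStatus(u32),
    Close,
}

/// Drain messages the way the loop above does: collect data as it
/// arrives, remember the exit status, and stop once the channel closes.
fn drain(messages: Vec<Msg>) -> (Vec<u8>, Option<u32>) {
    let mut output = Vec::new();
    let mut code = None;
    for msg in messages {
        match msg {
            Msg::Data(bytes) => output.extend_from_slice(&bytes),
            Msg::ExitStatus(status) => code = Some(status),
            Msg::Close => break,
        }
    }
    (output, code)
}

fn main() {
    let msgs = vec![Msg::Data(b"hello".to_vec()), Msg::ExitStatus(0), Msg::Close];
    let (output, code) = drain(msgs);
    println!("{} bytes, exit {:?}", output.len(), code); // prints "5 bytes, exit Some(0)"
}
```

The real loop does the same thing asynchronously, one `channel.wait().await` at a time, writing data straight to our stdout/stderr instead of buffering it.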
When we run it this time, we expect it to simply exit after printing the `Created open session channel.` message. Instead, we get this:

```
HTTPS: https://test.sish.top
```
Wait, the session is actually persisting?! And we even got some data from sish in the process!
When we created the channel through `session.channel_open_session()`, the server automatically knew that it could send us information through it. It is just a two-way tunnel where every message is secure, so we can read data, and even send some back if we want (for example, with [`stdin`](https://en.wikipedia.org/wiki/Standard_streams) if we're making a terminal application).
That's cool and all, but more important is how it gave us a URL for our service with automatic HTTPS, even! I wonder what happens if I try to open that link in my browser...
...
...It just times out after a while with "bad gateway", causing our SSH program to exit.
Well, that's less exciting. But at least it's doing _something_. Besides, if we never touch the link that it passed us, we can stay connected indefinitely. And as soon as we open the link on any device, we consistently get disconnected from the SSH server. That's proof that there's a relation between what the browser sees and our little Rust program.
The reason why we get disconnected is that we aren't handling any requests that come in. Right now, there's no way to read HTTP requests, much less send HTTP responses.
But I thought `channel_open_session()` was already doing that? Not really: it only serves to transfer messages between the client and the server. Instead, to handle each new connection, we need to use a new channel.
Sounds simple enough. Then how do we create these channels? The answer is also simple: We don't.
### Changing channels
At this point, it's worth taking a detour from all of the code and explaining how an SSH session actually works.
[RFC 4254](https://datatracker.ietf.org/doc/html/rfc4254) is the document that dictates how SSH connections are supposed to work on a higher level. There are some interesting details, but most importantly for us is the ["5. Channel Mechanism"](https://datatracker.ietf.org/doc/html/rfc4254#section-5) section:
> All terminal sessions, forwarded connections, etc., are channels. Either side may open a channel. Multiple channels are multiplexed into a single connection.
In other words, a connection can have multiple channels, each responsible for a part of the system. This explicitly includes forwarded connections, such as the ones we are looking for.
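That multiplexing idea can be modeled with plain data structures: every packet is tagged with the ID of the channel it belongs to, and the receiving side just routes payloads accordingly. A toy sketch (unrelated to russh's actual internals):

```rust
use std::collections::HashMap;

/// Toy demultiplexer for RFC 4254's channel model: one connection
/// carries many channels, and each data packet is tagged with the
/// recipient channel's ID.
fn demultiplex(packets: &[(u32, &[u8])]) -> HashMap<u32, Vec<u8>> {
    let mut channels: HashMap<u32, Vec<u8>> = HashMap::new();
    for (id, payload) in packets {
        channels.entry(*id).or_default().extend_from_slice(payload);
    }
    channels
}

fn main() {
    // Channel 0 could be our open session; channel 1, a forwarded connection.
    let channels = demultiplex(&[(0, b"HTTPS: ...".as_slice()), (1, b"GET / HTTP/1.1".as_slice())]);
    println!("{} channels", channels.len()); // prints "2 channels"
}
```

The real protocol adds channel-open negotiation, windowing, and per-channel close semantics on top, but the core bookkeeping really is this simple.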
Even more so, either side can open channels. Earlier, we opened one with `session.channel_open_session()` from the client-side, but the server is also allowed to open them if necessary.
Specifically, the ["7.1. Request Port Forwarding"](https://datatracker.ietf.org/doc/html/rfc4254#section-7.1) section lays out the mechanism for requesting a port. It's exactly what we're doing right now with `session.tcpip_forward()`.
In the next section, ["7.2. TCP/IP Forwarding Channels"](https://datatracker.ietf.org/doc/html/rfc4254#section-7.2), the expected behavior of the server is explained:
> When a connection comes to a port for which remote forwarding has been requested, a channel is opened to forward the port to the other side.
So a new channel is being created and opened _for_ us. The channel is labeled `forwarded-tcpip`, which corresponds with the `server_channel_open_forwarded_tcpip()` method of `russh::client::Handler`.
Previously, our `Client` struct defined only one method of that `Handler`, which accepted all of the key fingerprints that the server provides. So we gotta add a second one to handle any received forwarding channels.
Remember, forwarded connections are channels, so we can use them just like the channel we get from `session.channel_open_session()`. And thankfully, as you'll see, Tokio has some utilities to make their usage trivial for our case.
### Back in session
If I understood the documentation correctly, then we should be pretty close to actually getting something to the HTTP endpoint! We just need to create an HTTP server on our side to handle everything for us, then connect it to the data channels that we receive from the server.
First, let's make the simplest HTTP server with global state that we can:
```rust
// ... snip ...

#[derive(Clone)]
struct AppState {
    data: Arc<AtomicUsize>,
}

/// A lazily-created Router, to be used by the SSH client tunnels.
static ROUTER: LazyLock<Router> = LazyLock::new(|| {
    Router::new().route("/", get(hello)).with_state(AppState {
        data: Arc::new(AtomicUsize::new(1)),
    })
});

/// A basic example endpoint that includes shared state.
async fn hello(State(state): State<AppState>) -> String {
    let request_id = state.data.fetch_add(1, Ordering::AcqRel);
    format!("Hello, request #{}!", request_id)
}
```
If you're unfamiliar with Rust's [synchronization primitives](https://doc.rust-lang.org/std/sync/index.html), this may be a bit hard to read. But all this does is create a lazily-evaluated HTTP server on `ROUTER` that responds to each subsequent request with a global counter (1, 2, 3, and so on).
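If it helps, the counter mechanics can be isolated from the axum plumbing into a few std-only lines. Note that `fetch_add` returns the value from *before* the increment, which is why the responses start at 1 even though we immediately bump the counter:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, LazyLock};

/// The shared state from above, minus the axum plumbing; `DATA` here
/// plays the role of AppState's `data` field.
static DATA: LazyLock<Arc<AtomicUsize>> = LazyLock::new(|| Arc::new(AtomicUsize::new(1)));

/// What the `hello` handler computes for each request.
fn hello() -> String {
    // fetch_add returns the value from *before* the increment.
    let request_id = DATA.fetch_add(1, Ordering::AcqRel);
    format!("Hello, request #{}!", request_id)
}
```

Calling `hello()` repeatedly yields `Hello, request #1!`, then `#2`, and so on, no matter which thread (or which SSH channel) triggers it.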
Remember that our goal is to serve this router to any channels received through `server_channel_open_forwarded_tcpip()`. If we were in the C world, we'd need to reference the channel directly by its ID; but in `russh`, a struct representing the channel is already given to us, abstracting that detail away and avoiding any mistakes on the programmer's part.
In order to connect the `ROUTER` and the channel together, we'll turn the provided SSH channel into a [stream](https://tokio.rs/tokio/tutorial/streams), then use some magic with Hyper and Tower to be able to serve HTTP responses as if that stream were a TCP socket:
```rust
use hyper::service::service_fn;
// ... snip ...

#[async_trait]
impl client::Handler for Client {
    // ... snip ...

    async fn server_channel_open_forwarded_tcpip(
        &mut self,
        channel: Channel<client::Msg>,
        connected_address: &str,
        connected_port: u32,
        originator_address: &str,
        originator_port: u32,
        session: &mut Session,
    ) -> Result<(), Self::Error> {
        let router = &*ROUTER;
        let service = service_fn(move |req| router.clone().call(req));
        let server = Builder::new(TokioExecutor::new());
        tokio::spawn(async move {
            server
                .serve_connection_with_upgrades(TokioIo::new(channel.into_stream()), service)
                .await
                .expect("Invalid request");
        });
        Ok(())
    }
}
```

And, as you may have noticed, we never used a single TCP socket, other than to connect to the SSH server itself.
But then, why does the SSH client require that you specify a numbered port for remote forwarding if it's unnecessary? I imagine that the reason is convenience: it's easier to map one socket (the remote one) to another (your local one), and that's what you want to do in the majority of cases anyway.
However, you can see that the underlying SSH protocol is not too complicated, at least through the interfaces it exposes. We only had to write less than 200 lines of Rust code, and we're already doing things that the regular SSH client can't alone.
To summarize, this is what the code does:
1. Starts a connection with the SSH server.
2. Authenticates via public key.
## Painting the bigger picture
But a simple HTTP server isn't that interesting by itself, even though it's running over SSH instead of a socket. Can we do better?
Yes, we can. We get all of the nice HTTP features that we'd expect, including support for [WebSocket](https://en.wikipedia.org/wiki/WebSocket), though that's beyond the scope of this post.
What I'm more interested in is pushing the limits of this solution in terms of simple HTTP. And since I was just starting to learn [htmx](https://htmx.org/), it seemed like the perfect opportunity to put it to the test.
At first, I made a simple application that stores data for a bunch of checkboxes, then updates them when you click them, and periodically polls the server to grab modifications done by other users. It was a dumb but easy idea to implement, but I didn't stop there.
<figure>
<Image
src={imageCheckboxes}
alt="A screenshot of a browser with a Web 1.0-looking picross or nonogram grid, entitled Multipaint by Numbers, and containing instructions on how to play, as well as multiple colorful cursors."
/>
<figcaption>Running a poor man's clone of One Million Checkboxes.</figcaption>
</figure>
Seeing the checkboxes getting filled and emptied in a grid reminded me a lot of [nonograms](https://en.wikipedia.org/wiki/Nonogram). You might know them by "picross" or "paint by numbers", or not know them at all, but they are a kind of puzzle made by filling cells in a grid in order to reveal a pixelated image. So I thought, why not make a multiplayer version of it? And I called it Multipaint by Numbers, and worked on it over the next several days.
<figure>
  <Image src={imageMultipaintByNumbers} />
<figcaption>A screenshot of me playing Multipaint by Numbers together with myself.</figcaption>
</figure>
It's a janky mess, sure, but it definitely works! It has click-and-drag, it has flagging, and it even has cursors that (try to) match those of other people currently playing! It feels buggy (rather than _actually_ being buggy) and unresponsive, since HTMX wasn't designed for highly interactive applications like this one. But it was quite a fun learning experience! If you've dabbled in full-stack development and haven't checked out HTMX yet, I highly recommend it; compared to JS, it's like a breath of fresh air.
But if you just wanna play it yourself, Multipaint by Numbers is available to play on https://multipaint.sish.top, and you can find the source code [here](https://github.com/BadMannersXYZ/htmx-ssh-games).
## Awaiting the future
Of course, none of these projects do anything special. It'd be better to just make them as plain HTTP applications, after all. Why go through such a roundabout way to make webapps?
But there was a reason. Both "400 Checkboxes" and "Multipaint by Numbers" were just toy projects to learn the basics about russh, Axum, and HTMX. And I was hoping to go in a new direction with this knowledge.
Recall from my previous post the motivation for looking at SSH reverse port forwarding in the first place: I wanted to expose services from my home network that would otherwise get blocked by NAT.
This ties in with an idea I've had for a game for a while. It's supposed to run on your computer, but is controlled remotely through your phone (or a separate browser window), with interactivity that connects and synchronizes both ends. They could be running on the same network, but maybe people use their phone on cellular data, therefore having a completely different [ASN](<https://en.wikipedia.org/wiki/Autonomous_system_(Internet)>) backing it up.
Well, what if you could connect your phone to your computer without worrying about NAT? What if it was as simple as accessing a web page? What if the phone interactions were as simple as touching buttons in a mobile-first webapp?
If you've played any of the [Jackbox Party games](https://en.wikipedia.org/wiki/The_Jackbox_Party_Pack), you're already familiar with this kind of architecture (and it was one of the inspirations for this idea!). The only difference is that a single device will connect to the game, instead of multiple players.
On the other hand, if you are familiar with [WebRTC](https://en.wikipedia.org/wiki/WebRTC), you might be thinking that this isn't so different from a [TURN server](https://webrtc.org/getting-started/turn-server), and it's definitely similar! But for my project, I think that an SSH-based solution might work better:
- Setting up a new sish instance for my project is not very complicated, whereas WebRTC makes me want to pull my hair out.
- I was already planning on using HTML for the mobile interface (instead of, say, [a native app](https://aws.amazon.com/compare/the-difference-between-web-apps-native-apps-and-hybrid-apps/)), so a hypermedia-driven library like HTMX may suit my needs better than translating the plain data that WebRTC sends.
- However, it'd still require some Javascript on the mobile end of it, for things like [rollback netcode](https://en.wikipedia.org/wiki/Netcode#Rollback).
- SSH already comes with built-in authentication and encryption mechanisms, meaning that I wouldn't have to roll my own. (In fact, the people behind sish and tuns.sh leverage this feature of SSH plus _forward_ TCP connections to create [tunnel-based logins for services](https://pico.sh/tunnels).)
- The dependency on SSH is transparent, letting me work on the communication channel as if it were a plain webserver, or any other application for that matter. There is no lock-in to a specific technology like there is with WebRTC.
- Since I plan on having a web interface on the mobile device anyway, this scheme avoids adding extra logic for a separate webserver. The sish proxy will essentially only handle upgrading our HTTP connection to HTTPS, and the webserver can be embedded in the computer application, similarly to a [thin client](https://en.wikipedia.org/wiki/Thin_client).
With that said, there's nothing inherently wrong with WebRTC (other than it being [a complex mess of protocols](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Protocols)), and I'm not dismissing it straight away for this project.
Our chosen path has disadvantages too, one of them being that I'd be forced to use [TCP](https://en.wikipedia.org/wiki/Transmission_Control_Protocol) for communication. But since having a web interface on the remote controller was already part of the initial idea, that'd be unavoidable even if I picked WebRTC for the project. Another challenge is the added overhead of the proxy server, but with proper latency-based rollback, this can be mitigated, and isn't so different from what would happen with a TURN server anyway.
One final bonus that we get over a traditional client-server architecture is keeping responsibilities where they actually belong. Normally, this kind of game would require a central server that coordinates with two or more clients: the computer running the game, and the mobile phone(s) running the controller. With remote port forwarding, the computer **will** be the server, exposed directly through the Internet. The mobile phone will be a regular client of that server, and there's no opaque abstraction over their communication other than HTTP itself.
I've had this idea for a while now, but I was struggling to make it work with a traditional server-side architecture. It turns out that I don't need to implement anything myself. sish is configurable enough that it can serve many purposes, be it hosting multiple services, or managing multiple game connections. And for my project, it's definitely a viable solution that I'll look more into.
But for now, that's all I have to say about it. I hope that this blog post has given you good insight into the inner workings of SSH, and perhaps even given you ideas to try out yourself!