Real-Time Communication in Go: TCP Sockets, Message Framing, and WebSockets

TCP is a stream protocol, not a message protocol. If you read bytes into a fixed buffer, you will silently truncate messages larger than that buffer. You need framing. This post covers length-prefixed framing, a multi-client broadcast server, and WebSocket support for browser clients.

The first thing most tutorials about sockets in Go get wrong is the buffer. Reading into make([]byte, 1024) is not message-oriented. A 1025-byte message gets split across two reads. A 512-byte message and a 300-byte message can arrive in one read. TCP is a stream – you need to add structure on top of it.

The framing problem
#

Consider this common pattern from tutorials:

buf := make([]byte, 1024)
n, err := conn.Read(buf)
msg := string(buf[:n])

This has two bugs:

  1. If the sender writes 2000 bytes, the receiver gets 1024 bytes and silently discards the rest.
  2. If the sender writes two messages back to back, the receiver may get both in a single Read call with no boundary between them.

You need a framing protocol. The simplest reliable one is a 4-byte length prefix.

Length-prefixed framing
#

Write the message length as a 4-byte big-endian integer before the message body. The reader reads exactly 4 bytes first, learns the message size, then reads exactly that many bytes.

framing/framing.go
package framing

import (
	"encoding/binary"
	"fmt"
	"io"
	"net"
)

const maxMessageSize = 16 * 1024 * 1024 // 16 MB safety cap

// WriteMessage writes a length-prefixed message to conn.
func WriteMessage(conn net.Conn, data []byte) error {
	if len(data) > maxMessageSize {
		return fmt.Errorf("message too large: %d bytes", len(data))
	}
	header := make([]byte, 4)
	binary.BigEndian.PutUint32(header, uint32(len(data)))
	if _, err := conn.Write(header); err != nil {
		return fmt.Errorf("write header: %w", err)
	}
	if _, err := conn.Write(data); err != nil {
		return fmt.Errorf("write body: %w", err)
	}
	return nil
}

// ReadMessage reads a length-prefixed message from conn.
// It blocks until a complete message is available.
func ReadMessage(conn net.Conn) ([]byte, error) {
	header := make([]byte, 4)
	if _, err := io.ReadFull(conn, header); err != nil {
		return nil, fmt.Errorf("read header: %w", err)
	}
	size := binary.BigEndian.Uint32(header)
	if size > maxMessageSize {
		return nil, fmt.Errorf("message size %d exceeds limit", size)
	}
	body := make([]byte, size)
	if _, err := io.ReadFull(conn, body); err != nil {
		return nil, fmt.Errorf("read body: %w", err)
	}
	return body, nil
}

io.ReadFull is key here. It calls Read in a loop until it fills the buffer exactly, handling the case where the kernel delivers bytes in multiple chunks.

Note

The maxMessageSize cap protects against a malicious or buggy client sending a 4GB length prefix that causes your server to make([]byte, 4_000_000_000) and OOM. Always validate the size before allocating.

Multi-client TCP server with broadcast
#

A production-worthy server handles each client in its own goroutine and fans each incoming message out to every connected client (including the sender) through a central broadcast channel:

server/server.go
package server

import (
	"fmt"
	"log"
	"net"
	"sync"
	"time"

	"github.com/yourname/chat/framing"
)

type Server struct {
	mu      sync.RWMutex
	clients map[net.Conn]struct{}
	bcast   chan []byte
}

func New() *Server {
	return &Server{
		clients: make(map[net.Conn]struct{}),
		bcast:   make(chan []byte, 256),
	}
}

func (s *Server) Run(addr string) error {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return fmt.Errorf("listen: %w", err)
	}
	defer ln.Close()
	log.Printf("TCP server listening on %s", addr)

	go s.broadcaster()

	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Printf("accept: %v", err)
			continue
		}
		s.addClient(conn)
		go s.handleConn(conn)
	}
}

func (s *Server) broadcaster() {
	for msg := range s.bcast {
		s.mu.RLock()
		for conn := range s.clients {
			// Set a write deadline so a slow client cannot block the broadcast.
			_ = conn.SetWriteDeadline(time.Now().Add(2 * time.Second))
			if err := framing.WriteMessage(conn, msg); err != nil {
				log.Printf("broadcast write: %v", err)
			}
		}
		s.mu.RUnlock()
	}
}

func (s *Server) handleConn(conn net.Conn) {
	defer func() {
		s.removeClient(conn)
		conn.Close()
	}()

	for {
		// Read deadline: disconnect idle clients after 60s.
		_ = conn.SetReadDeadline(time.Now().Add(60 * time.Second))

		msg, err := framing.ReadMessage(conn)
		if err != nil {
			log.Printf("read from %s: %v", conn.RemoteAddr(), err)
			return
		}
		s.bcast <- msg
	}
}

func (s *Server) addClient(conn net.Conn) {
	s.mu.Lock()
	s.clients[conn] = struct{}{}
	s.mu.Unlock()
	log.Printf("client connected: %s (total: %d)", conn.RemoteAddr(), s.clientCount())
}

func (s *Server) removeClient(conn net.Conn) {
	s.mu.Lock()
	delete(s.clients, conn)
	s.mu.Unlock()
	log.Printf("client disconnected: %s (total: %d)", conn.RemoteAddr(), s.clientCount())
}

func (s *Server) clientCount() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.clients)
}

Warning

The broadcast loop holds a read lock while writing to all clients. If one client is slow, it blocks the broadcast for everyone else for up to the write deadline (2 seconds). For high-throughput systems, give each client a dedicated write channel and goroutine so a slow client is dropped without blocking others.

WebSocket server with gorilla/websocket
#

Browser clients cannot open raw TCP connections. WebSocket runs over HTTP and works in every browser. Use gorilla/websocket:

wsserver/wsserver.go
package wsserver

import (
	"log"
	"net/http"
	"sync"
	"time"

	"github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{
	ReadBufferSize:  1024,
	WriteBufferSize: 1024,
	CheckOrigin: func(r *http.Request) bool {
		return true // Restrict in production: check r.Header.Get("Origin")
	},
}

type Hub struct {
	mu      sync.RWMutex
	clients map[*websocket.Conn]struct{}
	bcast   chan []byte
}

func NewHub() *Hub {
	h := &Hub{
		clients: make(map[*websocket.Conn]struct{}),
		bcast:   make(chan []byte, 256),
	}
	go h.broadcaster()
	return h
}

func (h *Hub) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		log.Printf("upgrade: %v", err)
		return
	}
	h.addClient(conn)
	defer func() {
		h.removeClient(conn)
		conn.Close()
	}()

	conn.SetReadLimit(16 * 1024 * 1024)
	conn.SetReadDeadline(time.Now().Add(60 * time.Second))
	conn.SetPongHandler(func(_ string) error {
		conn.SetReadDeadline(time.Now().Add(60 * time.Second))
		return nil
	})

	for {
		_, msg, err := conn.ReadMessage()
		if err != nil {
			if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway, websocket.CloseAbnormalClosure) {
				log.Printf("ws read error: %v", err)
			}
			return
		}
		h.bcast <- msg
	}
}

func (h *Hub) broadcaster() {
	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case msg := <-h.bcast:
			h.mu.RLock()
			for conn := range h.clients {
				conn.SetWriteDeadline(time.Now().Add(2 * time.Second))
				if err := conn.WriteMessage(websocket.TextMessage, msg); err != nil {
					log.Printf("ws write: %v", err)
				}
			}
			h.mu.RUnlock()
		case <-ticker.C:
			// Send ping to all clients to keep connections alive.
			h.mu.RLock()
			for conn := range h.clients {
				conn.SetWriteDeadline(time.Now().Add(2 * time.Second))
				_ = conn.WriteMessage(websocket.PingMessage, nil)
			}
			h.mu.RUnlock()
		}
	}
}

func (h *Hub) addClient(conn *websocket.Conn) {
	h.mu.Lock()
	h.clients[conn] = struct{}{}
	h.mu.Unlock()
}

func (h *Hub) removeClient(conn *websocket.Conn) {
	h.mu.Lock()
	delete(h.clients, conn)
	h.mu.Unlock()
}

TCP vs WebSocket comparison
#

| | Raw TCP | WebSocket |
| --- | --- | --- |
| Latency | Lowest. No HTTP framing overhead. | Slightly higher. HTTP upgrade handshake at connection time, WebSocket framing on each message (~6 bytes overhead per message). |
| Protocol overhead | Minimal. Just your framing bytes. | Small but non-zero per frame. |
| Browser support | None. Browsers cannot open raw TCP sockets. | Full. Supported in every modern browser since 2011. |
| Use cases | Game servers, inter-service communication, custom protocols, IoT devices. | Chat apps, live dashboards, collaborative tools, any browser-facing real-time feature. |
| Framing | You implement it (length-prefix, delimiter, or protobuf varint). | Built-in. gorilla/websocket handles message boundaries automatically. |
| TLS | You manage it with crypto/tls. | Handled by the HTTP server (use ListenAndServeTLS or a reverse proxy). |

Message framing options
#

flowchart TD
    Q1{What are your constraints?}
    Q1 -->|Simple text messages| Delim["Delimiter-based\n(newline \\n per message)"]
    Q1 -->|Binary or variable-length| LenPfx["Length-prefix\n(4-byte header + body)"]
    Q1 -->|Already using protobuf| Varint["Protobuf varint framing\n(self-describing length)"]
    Delim -->|Downside| D1["Delimiter must be escaped\nin message content"]
    LenPfx -->|Downside| L1["Fixed max message size\n(set at design time)"]
    Varint -->|Downside| V1["Requires protobuf\ndependency"]

Production considerations
#

Connection limits. The OS limits open file descriptors per process; on Linux the default soft limit is often 1024. For a chat server with thousands of clients, raise it with ulimit -n 65536 and configure fs.file-max in /etc/sysctl.conf. Each per-connection goroutine also starts with a small stack (2 KB in recent Go releases) that grows on demand.

Read and write deadlines. Always set them. conn.SetReadDeadline disconnects clients that stop sending (detecting half-open connections). conn.SetWriteDeadline prevents a slow client from blocking your broadcast goroutine.

Ping/pong keepalives. TCP keepalive packets (OS-level) may take minutes to detect a dropped connection. Application-level pings (WebSocket ping frames, or a custom message in TCP) detect disconnection much faster.

Graceful shutdown. Listen for SIGTERM, stop accepting new connections, drain the broadcast channel, then close all connections. sync.WaitGroup tracks in-flight goroutines.

Common pitfalls
#

- **No framing.** Reading with a fixed-size buffer silently truncates or merges messages. Always use a framing protocol; length-prefix is the simplest correct approach.
- **No read or write deadlines.** A client that stops responding keeps its goroutine and file descriptor alive indefinitely. Always set deadlines with `SetReadDeadline` and `SetWriteDeadline`.
- **Blocking broadcast.** If the broadcast loop writes to clients sequentially without write deadlines, one slow client stalls everyone else. Add per-write deadlines, or give each client a buffered channel and dedicated write goroutine.
- **One goroutine for all connections.** Handling all connections in a single goroutine with a select loop does not scale. Goroutine-per-connection is idiomatic Go and scales well to thousands of concurrent clients.
- **Ignoring io.ReadFull.** A plain `conn.Read` into a slice does not guarantee the buffer is filled; TCP may deliver data in chunks. Use `io.ReadFull` for exact-size reads.

When to use sockets, NATS, or gRPC
#

| Need | Recommendation |
| --- | --- |
| Browser clients | WebSocket |
| Inter-service messaging, pub/sub, fan-out | NATS or Kafka |
| Typed RPC between services | gRPC |
| Raw binary protocol, full control | TCP with custom framing |
| Low-latency game or simulation | UDP with your own reliability layer |

If you want to go deeper on any of this, I offer 1:1 coaching sessions for engineers working on AI integration, cloud architecture, and platform engineering. Book a session (50 EUR / 60 min) or reach out at manuel.fedele+website@gmail.com.
