Implementing LLM Response Caching with Redis

Caching LLM responses saves money and improves latency:

import OpenAI from 'openai';
import { createHash } from 'crypto';
import Redis from 'ioredis';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const redis = new Redis();
const CACHE_TTL = 3600; // 1 hour

// Deterministic cache key: the same messages + model always hash to the same key
function hashPrompt(messages, model) {
  const content = JSON.stringify({ messages, model });
  return createHash('sha256').update(content).digest('hex');
}

async function cachedChat(messages, options = {}) {
  const { model = 'gpt-4', bypassCache = false } = options;
  const cacheKey = `llm:${hashPrompt(messages, model)}`;

  if (!bypassCache) {
    const cached = await redis.get(cacheKey);
    if (cached) {
      console.log('Cache HIT');
      return JSON.parse(cached);
    }
  }

  console.log('Cache MISS');
  const response = await openai.chat.completions.create({
    model,
    messages
  });

  await redis.setex(cacheKey, CACHE_TTL, JSON.stringify(response));
  return response;
}
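
Usage is transparent: the first call pays for the API round trip, and identical calls within the TTL are served from Redis. A quick sketch (the prompt is illustrative):

const messages = [{ role: 'user', content: 'Explain Redis in one sentence.' }];

const first = await cachedChat(messages);  // logs "Cache MISS", calls the API
const second = await cachedChat(messages); // logs "Cache HIT", served from Redis
console.log(second.choices[0].message.content);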

Semantic Caching

For similar (not exact) queries, use embedding similarity with a threshold.
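
A minimal sketch of the idea, assuming the OpenAI client from above and a small in-memory index; a production setup would use a vector store, and the 0.95 threshold is illustrative:

const semanticIndex = []; // entries: { embedding, response }

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function embedText(text) {
  const res = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });
  return res.data[0].embedding;
}

async function semanticCachedChat(messages, threshold = 0.95) {
  const embedding = await embedText(messages.map(m => m.content).join('\n'));

  // Serve the closest previous answer if it clears the similarity threshold
  for (const entry of semanticIndex) {
    if (cosineSimilarity(embedding, entry.embedding) >= threshold) {
      console.log('Semantic cache HIT');
      return entry.response;
    }
  }

  const response = await cachedChat(messages); // fall back to the exact-match cache
  semanticIndex.push({ embedding, response });
  return response;
}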

Text Chunking Strategies for RAG Applications

Chunking strategies greatly affect RAG quality:

import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

// Basic chunking
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
  separators: ['\n\n', '\n', ' ', '']
});

const chunks = await splitter.splitText(document);

Semantic Chunking

async function semanticChunk(text, maxTokens = 500) {
  // Split on sentence boundaries, then pack sentences into chunks of at most maxTokens
  const sentences = text.match(/[^.!?]+[.!?]+/g) || [text];
  const chunks = [];
  let current = [];
  let tokenCount = 0;

  for (const sentence of sentences) {
    const tokens = sentence.split(/\s+/).length; // rough estimate: one token per word
    if (tokenCount + tokens > maxTokens && current.length) {
      chunks.push(current.join(' '));
      current = [];
      tokenCount = 0;
    }
    current.push(sentence);
    tokenCount += tokens;
  }
  if (current.length) chunks.push(current.join(' '));
  return chunks;
}

Best Practices

  • Chunk size: 500-1000 tokens
  • Overlap: 10-20% so context carries across chunk boundaries (see the sketch after this list)
  • Preserve semantic boundaries
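
One way to apply the overlap guideline to the sentence chunker above: carry the tail of each chunk into the next one (a sketch; the 15% default and the word-based tail are illustrative):

async function chunkWithOverlap(text, maxTokens = 500, overlapRatio = 0.15) {
  const chunks = await semanticChunk(text, maxTokens);
  return chunks.map((chunk, i) => {
    if (i === 0) return chunk;
    // Prepend the tail of the previous chunk as shared context
    const prevWords = chunks[i - 1].split(/\s+/);
    const overlapWords = Math.floor(maxTokens * overlapRatio);
    return prevWords.slice(-overlapWords).join(' ') + ' ' + chunk;
  });
}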

Building Conversational AI with Context Memory in Node.js

Building a conversational AI with memory requires careful context management:

class ConversationManager {
  constructor(options = {}) {
    this.maxTokens = options.maxTokens || 4000;
    this.systemPrompt = options.systemPrompt || 'You are a helpful assistant.';
    this.conversations = new Map();
  }

  getHistory(sessionId) {
    if (!this.conversations.has(sessionId)) {
      this.conversations.set(sessionId, []);
    }
    return this.conversations.get(sessionId);
  }

  async chat(sessionId, userMessage) {
    const history = this.getHistory(sessionId);
    history.push({ role: 'user', content: userMessage });

    // Trim oldest turns if the history exceeds the token budget,
    // always keeping at least the current message
    while (history.length > 1 && this.estimateTokens(history) > this.maxTokens) {
      history.shift();
    }

    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        { role: 'system', content: this.systemPrompt },
        ...history
      ]
    });

    const reply = response.choices[0].message.content;
    history.push({ role: 'assistant', content: reply });
    return reply;
  }

  estimateTokens(messages) {
    // Rough heuristic: ~4 characters per token for English text
    return messages.reduce((sum, m) => sum + m.content.length / 4, 0);
  }
}
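
Usage across a session might look like this (a sketch; in practice the session ID would come from your transport layer):

const manager = new ConversationManager({ maxTokens: 4000 });

await manager.chat('session-42', 'My name is Ada.');
const reply = await manager.chat('session-42', 'What is my name?');
// The second reply can answer "Ada" because the history is replayed each turn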

Async waterfall example in Node.js

To avoid callback hell, async.waterfall is a very useful tool for structuring calls in a sequence, making sure each step passes its data to the next.

var async = require('async');

async.waterfall(
    [
        function(callback) {
            callback(null, 'Yes', 'it');
        },
        function(arg1, arg2, callback) {
            var caption = arg1 +' and '+ arg2;
            callback(null, caption);
        },
        function(caption, callback) {
            caption += ' works!';
            callback(null, caption);
        }
    ],
    function (err, caption) {
        console.log(caption);
        // Yes and it works!
    }
);
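
For comparison, the same sequencing in modern Node.js with async/await (a sketch, no library needed):

const step1 = async () => ['Yes', 'it'];
const step2 = async ([arg1, arg2]) => arg1 + ' and ' + arg2;
const step3 = async (caption) => caption + ' works!';

(async () => {
  console.log(await step3(await step2(await step1()))); // Yes and it works!
})();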

nJoy 😉

Node.js: start a script or run a command and hand off to the OS (no waiting)

Sometimes you want to run something in the OS from your Node code, but you do not want to follow it or have a callback posted on your stack. By default, child_process spawning holds on to a handle and parks a callback on the stack; however, there is an option called detached.

var spawn = require('child_process').spawn;
spawn('/usr/scripts/script.sh', ['param1'], {
    detached: true
});

You can even set up the call to pipe the child's stdout and stderr to OS file descriptors like this:

var fs = require('fs'),
    spawn = require('child_process').spawn,
    out = fs.openSync('./out.log', 'a'),
    err = fs.openSync('./err.log', 'a');

spawn('/usr/scripts/script.sh', ['param1'], {
    stdio: [ 'ignore', out, err ], // ignore stdin, send stdout to out.log and stderr to err.log
    detached: true
}).unref();

The unref() call disconnects the child process from the parent's event loop; from the parent it equates to a disown in a shell.

Thanks to the OP: https://stackoverflow.com/questions/25323703/nodejs-execute-command-in-background-and-forget

Also Ref: https://nodejs.org/api/child_process.html#child_process_child_process_spawn_command_args_options

and

https://github.com/nodejs/node-v0.x-archive/issues/9255

nJoy 😉

Identify OS on remote host

For Nmap's OS detection to even make a guess, it needs to find at least one open and one closed port on the remote host. Using the previous scan results, let us find out more about the host 192.168.0.115:

# nmap -O -sV 192.168.0.115

Output:

Starting Nmap 7.80 ( https://nmap.org ) at 2020-10-02 12:21 CEST
Nmap scan report for 192.168.0.115
Host is up (0.00023s latency).
Not shown: 991 closed ports
PORT      STATE SERVICE     VERSION
22/tcp    open  ssh         OpenSSH 5.1 (protocol 2.0)
80/tcp    open  http        Apache httpd 2.2.19 ((Unix) mod_ssl/2.2.19 OpenSSL/0.9.8zf DAV/2)
111/tcp   open  rpcbind     2 (RPC #100000)
139/tcp   open  netbios-ssn Samba smbd 3.X - 4.X (workgroup: WORKGROUP)
443/tcp   open  ssl/http    Apache httpd 2.2.19 ((Unix) mod_ssl/2.2.19 OpenSSL/0.9.8zf DAV/2)
445/tcp   open  netbios-ssn Samba smbd 3.X - 4.X (workgroup: WORKGROUP)
873/tcp   open  rsync       (protocol version 29)
2049/tcp  open  nfs         2-4 (RPC #100003)
49152/tcp open  upnp        Portable SDK for UPnP devices 1.6.9 (Linux 2.6.39.3; UPnP 1.0)
MAC Address: 00:26:2D:06:39:DB (Wistron)
Device type: general purpose
Running: Linux 2.6.X|3.X
OS CPE: cpe:/o:linux:linux_kernel:2.6 cpe:/o:linux:linux_kernel:3
OS details: Linux 2.6.38 - 3.0
Network Distance: 1 hop
Service Info: OS: Linux; CPE: cpe:/o:linux:linux_kernel:2.6.39.3


OS and Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 14.58 seconds

nJoy 😉

How to quit ESXi SSH and leave background tasks running

In Linux, when a console session is closed, most background jobs (suspended with ^Z and resumed with bg %n) stop running, because the parent (the SSH session) sends a SIGHUP to all its children when it closes properly. Some programs catch and ignore the SIGHUP and keep running, re-parented to init. The disown command in a shell removes a background job from the list of jobs that will be sent a SIGHUP.

In ESXi there is no disown command. However, there is a way to close the shell immediately without the SIGHUPs being issued:

exec </dev/null >/dev/null 2>/dev/null

Normally exec runs a command and switches it out for the current shell; given only redirections, as here, it applies them to the current shell itself. Stdin becomes /dev/null, so the shell reads EOF and exits immediately, and since stdout and stderr are detached from the terminal, the background tasks keep running.

nJoy 😉

DHCP debugging with tcpdump


tcpdump filter to match DHCP packets from a specific client MAC address (udp[38:4] covers the last four bytes of the chaddr field):

tcpdump -i br0 -vvv -s 1500 '((port 67 or port 68) and (udp[38:4] = 0x3e0ccf08))'

tcpdump filter to capture packets sent by the client (DISCOVER, REQUEST, INFORM); udp[8:1] is the BOOTP op field, which is 1 for client requests:

tcpdump -i br0 -vvv -s 1500 '((port 67 or port 68) and (udp[8:1] = 0x1))'

Sample output:


 21:38:05.644153 IP (tos 0x0, ttl 64, id 32104, offset 0, flags [none], proto UDP (17), length 374)
     0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 12:42:82:cb:7a:7e (oui Unknown), length 346, xid 0xd01f0ad4, secs 18694, Flags [none] (0x0000)
   Client-Ethernet-Address 12:42:82:cb:7a:7e (oui Unknown)
   Vendor-rfc1048 Extensions
     Magic Cookie 0x63825363
     DHCP-Message Option 53, length 1: Discover
     Client-ID Option 61, length 7: ether 12:42:82:cb:7a:7e
     SLP-NA Option 80, length 0""
     NOAUTO Option 116, length 1: Y
     MSZ Option 57, length 2: 1472
     Vendor-Class Option 60, length 49: "dhcpcd-7.1.0:Linux-4.19.59-sunxi:armv7l:Allwinner"
     Hostname Option 12, length 11: "whiteorange"
     T145 Option 145, length 1: 1
     Parameter-Request Option 55, length 15: 
       Subnet-Mask, Classless-Static-Route, Static-Route, Default-Gateway
       Domain-Name-Server, Hostname, Domain-Name, MTU
       BR, NTP, Lease-Time, Server-ID
       RN, RB, Option 119
     END Option 255, length 0 

…And yes, it is an Orange Pi Zero, for those playing along at home.

nJoy 😉

Convert Large numbers to binary in Excel

=DEC2BIN(MOD(QUOTIENT($A$1,16^7),16),4)&" "
&DEC2BIN(MOD(QUOTIENT($A$1,16^6),16),4)&" "
&DEC2BIN(MOD(QUOTIENT($A$1,16^5),16),4)&" "
&DEC2BIN(MOD(QUOTIENT($A$1,16^4),16),4)&" "
&DEC2BIN(MOD(QUOTIENT($A$1,16^3),16),4)&" "
&DEC2BIN(MOD(QUOTIENT($A$1,16^2),16),4)&" "
&DEC2BIN(MOD(QUOTIENT($A$1,16^1),16),4)&" "
&DEC2BIN(MOD(QUOTIENT($A$1,16^0),16),4)

Excel's DEC2BIN only handles values up to 511, so the formula above splits the number in A1 into eight 4-bit nibbles with QUOTIENT and MOD and converts each nibble separately. That makes it work for numbers up to 32 bits.

nJoy 😉