Useful OCaml Patterns for Competitive Programming

Competitive programming is not just a fun way to level up your algorithmic skills—it’s also a game of efficiency and clever implementations. While it shares some similarities with regular software development, competitive programming often requires specialized techniques to handle input parsing, performance optimizations, and more. In this post, we’ll explore some essential OCaml patterns that help bridge the gap between everyday coding and competitive programming, especially when solving problems on platforms like AtCoder.

Handling Standard Input

In competitive programming, problems typically require reading input from stdin and printing results to stdout. OCaml provides several ways to parse input, each suited for different scenarios.

To handle the input, the following straightforward ones work well:

read_line () → Reads a line as a string.
read_int () → Reads a line as an int.

But they are sometimes insufficient for complex input formats, and you may consider to use Scanf or customized functions.

Parsing Multiple Values from a Single Line

For inputs like 13 7 (two integers on one line), Scanf.sscanf is your friend. It allows pattern-matching with format strings:

let sum = Scanf.sscanf (read_line ()) "%d %d" (fun a b -> a + b)

You can also capture values separately by returning a tuple:

let x, y = Scanf.sscanf (read_line ()) "%d %d" (fun x y -> (x, y))

This approach surprisingly retains static type safety. This does not help a lot in the field of competitive programming, but it is a great win for software programming in the real world.

Reading Sequences of Numbers

When dealing with sequences, you definitely need a data structure for it rather than put them all into different variables.

List: Ideal for functional operations (e.g., map, fold), but slow for random access.
Array: Offers O(1) access, crucial for performance-heavy tasks.

To directly gain these data structure from the standard input, you can use the following custom functions:

(* read_int_list : unit -> int list *)
let read_int_list () =
  String.split_on_char ' ' (read_line ())
  |> List.map int_of_string

(* read_int_array : unit -> int array *)
let read_int_array n =
  Array.init n @@ fun _ -> Scanf.scanf " %d" @@ (+) 0

Memoization for Dynamic Programming

Memoization caches function results to avoid redundant calculations. This is pretty useful typically for dynamic programming (DP) that requires repeated uses of subproblems.

Take a problem from Education Dynamic Programming Contest (EDPC) from AtCoder as an example:

There are $N$ stones, numbered $1, 2, \cdots, N$ . For each $i$ (1 \leq i \leq N $), the height of Stone$ i $is$ h_i$.

There is a frog who is initially on Stone $1$ . He will repeat the following action some number of times to reach Stone $N$ :

If the frog is currently on Stone $i$ , jump to Stone $i + 1$ or Stone or Stone $i + 2$ . Here, a cost of $|h_i - h_j|$ is incurred, where $j$ is the stone to land on.

Find the minimum possible total cost incurred before the frog reaches Stone $N$ .

https://atcoder.jp/contests/dp/tasks/dp_a

The optimal cost for Stone $N$ can be computed by comparing the costs for Stones $N-1$ and $N-2$ . Recursively they can be computed by computing stones with lower indices, and by memoizing these subproblems, we can avoid recalculating them repeatedly, which results in passing time-complexity requirements.

Implementing Memoization with `Hashtbl`

The key concept of memoization is to record computed values and they will be used without any computation when the same input is given to a function. Any type of record works such as Array or record, but we often use Hashtbl to achieve memoization/

OCaml’s Hashtbl provides fast lookups and storage:

Hashtbl.create size → Creates a hash table.
Hashtbl.find tbl key → Retrieves a value (raises Not_found if missing).
Hashtbl.add tbl key value → Stores a key-value pair.

Here’s a reusable memoize function:

let memoize size f =
  let h = Hashtbl.create size in
  let rec g x =
    try Hashtbl.find h x
    with Not_found ->
      let y = f g x in
      Hashtbl.add h x y;
      y
  in
  g

The way of implementing a recursive function is a bit trickly here. f is not implemented with rec notation and instead it takes self as its first argument, enabling recursive calls via the memoized wrapper.

  let f self i = match i with
    | 0 -> 0
    | 1 -> abs (hs.(0) - hs.(1))
    | _ ->
      let step d = (self (i - d)) + abs (hs.(i - d) - hs.(i)) in
      min (step 1) (step 2)
  in
  memoize n f