Sometimes I like to tinker a bit with the frontend side of development. Thus, I decided to write a computer vision app in React. Although there are great tools like Gradio or FiftyOne to visualize your work, there are no plug-and-play systems that can be presented to the customer. The app is far from perfect, but it may give you some hints on how such an app can be designed, or you can use it as a starting point and adapt the code to your needs.
This post is a bit longer as it defines the basis of a React app for computer vision. You can find the complete code in this Repo and you can try the app here.
Introduction
I will explain the main building blocks of the app but not every detail. Thus, this post is not beginner-friendly. This post is more like a personal documentation for my side project and should provide one of many solutions for a given problem. If there are any questions about details or suggestions for improvement, please let me know.
Before we start, let me first introduce our frontend tech stack. My background is in Data Science, and I learned some frontend later. Thus, I don't like to operate on the DOM directly, but thankfully React abstracted web development into a declarative programming style. We will only use functional components and ES6 syntax. Furthermore, I am not a CSS wizard, so I'm happy with pre-styled components from Material-UI (MUI). We will also use TypeScript, which adds some complexity initially but is handy when the application grows. Lastly, we will use Redux Toolkit as our state management tool. Once you learn it, it makes things a lot easier.
Although most mobile deep learning applications use TensorFlow.js, training computer vision models with TensorFlow is always a bit odd. For PyTorch, there are many great libraries, like IceVision, which speed up the process enormously. Thus, we use ONNXruntime to execute models, which can be exported from all state-of-the-art deep learning frameworks.
Setup
I will list all steps necessary for an Ubuntu installation, but you will find plenty of material online for other operating systems. If you do not already have Node.js with Yarn as a package manager installed, you can do this in 3 simple steps:
# 1. Install node
cd ~
curl -sL https://deb.nodesource.com/setup_16.x -o nodesource_setup.sh
sudo bash nodesource_setup.sh
sudo apt install nodejs

# 2. Install npm
sudo apt-get install npm

# 3. Install yarn
npm install --global yarn
Now that you have installed the necessary environment, we can use create-react-app to generate a good starting point for our React app for computer vision models. We will use a create-react-app template with Redux and TypeScript:
npx create-react-app cv-app --template redux-typescript
And that’s it! Unbelievably easy to create a React app environment these days. Now, we need to install MUI, MUI icons, and uuid:
yarn add @mui/material @emotion/react @emotion/styled @mui/icons-material uuid
yarn add --dev @types/uuid
Uuid is just a tiny helper to create unique ids. The material stack has pre-styled and material-opinionated components, saving a lot of time.
Ok, now we have a working React app called cv-app on your computer. Open the top-level folder in your favorite IDE and let's make some adjustments to the template: As I said, I'm happy with a completely declarative frontend, so we remove all CSS files, as we will add CSS with the style prop of JSX (I know it's not best practice, but it is ok for the size of the app) or with the sx prop of MUI components. Furthermore, we can delete all the template code in App.tsx, the feature folder, and the references to the CSS files.
Then we add the <ThemeProvider> component from MUI as the top-level component for the <App> component. You can use the MUI theme builder to create your own theme with a good overview. I came up with this design:
import React from "react";
import { createTheme, ThemeProvider } from "@mui/material/styles";
import CustomImageList from "./components/CustomImageList";
import CustomAppbar from "./components/CustomAppbar";
import AddContentFab from "./components/AddContentFab";

export const theme = createTheme({
  palette: {
    mode: "dark",
    primary: {
      main: "#8a43c1",
    },
    secondary: {
      main: "#375ee8",
    },
    background: {
      default: "#353434",
      paper: "#525252",
    },
  },
  spacing: 8,
});

function App() {
  return (
    <ThemeProvider theme={theme}>
      <CustomAppbar />
      <CustomImageList />
      <AddContentFab />
    </ThemeProvider>
  );
}

export default App;
Inside the <ThemeProvider> component, we add our top-level components for our app. You can already see that we only use three components for our React app for computer vision. We will go into detail about these later. First, we need to have an idea about the visual appearance and the data types we will use.
Main Layout of our React app for computer vision
Before we start to write some code, we need an idea of what our app should look like. The main feature is an image view, which shows an image and its labels (classification, bounding boxes, or polygons). We display this main component inside a Modal for a larger view and in a list as an overview. Afterward, we need a button or FAB (Floating Action Button) to add an image. Lastly, we create an App bar to improve the visual appearance. You can see the app in the video above. Thus, we have four components in total:
- <ImageCard>: Displays an image and the main actions of an image
- <ImageList>: Incorporates several instances of <ImageCard> in a list
- <AddContentFab>: Fixed button to add content
- <Appbar>: To make it look more Material-ish
Defining types
One major advantage of TypeScript is that you define your data types beforehand and stick to them throughout the project, which throws errors directly at compilation. Since we focus on images only with this app, our central object is called imageCardObject:
export interface imageCardObject {
  id: string;
  src: string;
  title: string;
  dateCreated: string;
  highlighted: boolean;
  annotations: Array<
    bboxAnnotationObject | classAnnotationObject | polygonAnnotationObject
  >;
}
It contains all relevant information about an image, like the source, the title, and the creation date. The src stores a data URL (base64). On top of that, it contains a key called annotations that holds class, bounding box, and polygon annotations.
The different annotation types are very similar. All contain a className, id, score, model, and color key. Each annotation type has a unique key, like the box key that should be an array with [x1, y1, width, height], and is identified by a type key. So typical data science stuff until here.
export interface baseAnnotationObject {
  className: string;
  color: string;
  id: string;
  score: number;
  model: string;
}

export interface classAnnotationObject extends baseAnnotationObject {
  type: "class";
}

export interface bboxAnnotationObject extends baseAnnotationObject {
  box: Array<number>;
  type: "bbox";
}

export interface polygonAnnotationObject extends baseAnnotationObject {
  polygon: Array<Array<number>>; // Array, that consists of Arrays with x, y points
  type: "polygon";
}

export type anyAnnoationObject =
  | bboxAnnotationObject
  | polygonAnnotationObject
  | classAnnotationObject;
Besides the different annotation types, we also create an anyAnnoationObject, a small wrapper to save some lines of code when we define a type where any of the three annotation types is acceptable.
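To make these shapes more tangible, here is a small, hypothetical example instance (all ids, class names, model names, and values are made up for illustration; only the structure follows the types above):

import { imageCardObject } from "../util/types";

// Hypothetical example, only to illustrate the data shapes defined above
const exampleImage: imageCardObject = {
  id: "0f8a2c1e-0000-4000-8000-000000000000",
  src: "data:image/jpeg;base64,...", // base64-encoded data URL of the image
  title: "cat.jpg",
  dateCreated: String(new Date()),
  highlighted: false,
  annotations: [
    {
      type: "bbox",
      id: "anno-1",
      className: "cat",
      score: 0.92,
      model: "some-detector",
      color: "#8a43c1",
      box: [0.1, 0.2, 0.3, 0.4], // [x1, y1, width, height], relative (0..1)
    },
    {
      type: "class",
      id: "anno-2",
      className: "indoor",
      score: 0.75,
      model: "some-classifier",
      color: "#375ee8",
    },
  ],
};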
Set up Redux for our data types
We define the redux state for images in imageState.ts. Our main object is an array that contains instances of the imageCardObject. Our reducer includes three methods to add, update, and remove a given imageCardObject. Here is the code:
import { createSlice } from "@reduxjs/toolkit";
import { imageCardObject } from "../util/types";
import {
  saveImageState,
  removeImageState,
  loadImageState,
  saveStateIds,
  loadStateIds,
} from "../util/localStateHandler";

export const imageSlice = createSlice({
  name: "images",
  initialState: {
    images: loadStateIds()
      .map((id: string) => loadImageState(id))
      .filter((el: imageCardObject | null) => el !== null),
  },
  reducers: {
    updateImage: (state, action) => {
      state.images = state.images.map((el: imageCardObject) =>
        el.id === action.payload.id ? action.payload : el
      );
      saveImageState(action.payload);
    },
    addImage: (state, action) => {
      state.images = [...state.images, action.payload];
      saveImageState(action.payload);
      saveStateIds(state.images.map((el: imageCardObject) => el.id));
    },
    removeImage: (state, action) => {
      state.images = [
        ...state.images.filter(
          (el: imageCardObject) => el.id !== action.payload.id
        ),
      ];
      removeImageState(action.payload.id);
      saveStateIds(state.images.map((el: imageCardObject) => el.id));
    },
  },
});

// Action creators are generated for each case reducer function
export const { addImage, updateImage, removeImage } = imageSlice.actions;

export default imageSlice.reducer;
For the add and update reducers, we expect an imageCardObject as payload. When updating an object, we use the unique id to compare each object with our payload object and exchange the old object with the new one. Adding images is easier because we just append the payload to the old state. To remove an image, we only need the id and the filter method, keeping only objects that do not match the id.
As you can see, we import several functions from /util/localStateHandler.tsx and use them after each state update accordingly. This small util file saves the redux state to your local storage. However, this is not optimal because the state can get very large since the whole image is stored in it, and local storage has quotas for the maximum item size. Thus, we will later implement a function to resize images that are too large. Although uploading the images to a server would be preferable, this is a topic for another blog post.
import { imageCardObject } from "./types";

const idName = "Ids";

export const saveStateIds = (event: string[]) => {
  localStorage.setItem(idName, JSON.stringify(event));
};

export const loadStateIds = (): string[] => {
  const ids = localStorage.getItem(idName);
  return ids !== null ? JSON.parse(ids) : [];
};

export const saveImageState = (event: imageCardObject) => {
  localStorage.setItem(event.id, JSON.stringify(event));
};

export const loadImageState = (id: string): imageCardObject => {
  return JSON.parse(String(localStorage.getItem(id)));
};

export const removeImageState = (id: string) => {
  localStorage.removeItem(id);
};
We use the local storage for two things:
- Save a list with all image IDs to create the initial state
- Save/load/remove each imageCardObject to/from local storage
The first part is done by the functions loadStateIds and saveStateIds. For loading, we parse the stringified list and return it, and for saving, we stringify the list of IDs and save them to local storage.
For loading images, we parse the loaded imageCardObject, and for saving, we stringify it. Furthermore, we write another function for deletion, which is also straightforward. Based on these functions, we can retrieve our initial state by loading all image IDs and mapping over them to load each imageCardObject.
images: loadStateIds()
  .map((id: string) => loadImageState(id))
  .filter((el: imageCardObject | null) => el !== null)
The Appbar

Let’s start with the Appbar, the simplest component. We can directly use MUI’s Appbar and add additional CSS with the sx prop. Let’s make the Appbar sticky: it follows you as you scroll down and always stays at the top of the visible screen. To stick it to the top of the screen, we set the bottom prop to auto and the top prop to zero. Then, we need padding to enlarge the Appbar slightly for a nicer look. Lastly, we import the theme from our App.tsx to obtain our secondary color.
import AppBar from "@mui/material/AppBar";
import Typography from "@mui/material/Typography";
import { theme } from "../App";

export default function CustomAppbar() {
  return (
    <AppBar
      sx={{
        position: "sticky",
        bottom: "auto",
        top: 0,
        padding: 2,
        background: theme.palette.secondary.main,
      }}
    >
      <Typography variant="h5" component="div" sx={{ flexGrow: 1 }}>
        Welcome!
      </Typography>
    </AppBar>
  );
}
The Image card object for our React app for computer vision

The <ImageCard> component, which displays images and their labels.

This is our main component with roughly 260 lines of code. We need this component to understand the other two components (AddContentFab and CustomImageList), as they are only simple wrappers around this one.
Imports
The first import lines are the basic MUI components we use throughout the component. The second part imports the specific functions and interfaces of our project. We import all annotation types as well as the imageCardObject type. Then we import a custom function called drawAnnotations, which we use to draw annotations on an HTML canvas. The theme is needed to access the color palette of the app. useAppDispatch is our TypeScript wrapper around redux's useDispatch, which is used to update a redux state. Lastly, we import some basic React hooks.
// MUI
import Card from "@mui/material/Card";
import CardHeader from "@mui/material/CardHeader";
import CardMedia from "@mui/material/CardMedia";
import CardContent from "@mui/material/CardContent";
import CardActions from "@mui/material/CardActions";
import IconButton from "@mui/material/IconButton";
import DeleteIcon from "@mui/icons-material/Delete";
import StarBorderIcon from "@mui/icons-material/StarBorder";
import StarIcon from "@mui/icons-material/Star";
import Chip from "@mui/material/Chip";
import CheckBoxOutlineBlankIcon from "@mui/icons-material/CheckBoxOutlineBlank";
import HexagonOutlinedIcon from "@mui/icons-material/HexagonOutlined";
import ImageOutlinedIcon from "@mui/icons-material/ImageOutlined";
// Own
import {
  anyAnnoationObject,
  bboxAnnotationObject,
  classAnnotationObject,
  imageCardObject,
  polygonAnnotationObject,
} from "../util/types";
import { drawAnnotations } from "../util/drawAnnotations";
import { theme } from "../App";
import { useAppDispatch } from "../app/hooks";
import { updateImage, removeImage } from "../app/imageState";
import InferenceMenu from "./InferenceMenu";
// React
import { useEffect, useRef, useState } from "react";
Input props
To mount our component, we need more information than just the basic props that are described by our imageCardObject. Therefore, we extend the type with a width, height, showActions, and onClick attribute:
export interface imageCardProps extends imageCardObject {
  width: number;
  height: number;
  showActions?: boolean;
  onClick?: () => void;
}

export default function ImageCard({
  id,
  src,
  title,
  dateCreated,
  highlighted,
  annotations,
  width,
  height,
  showActions = true,
  onClick,
}: imageCardProps) {
The width and height props are used to create our image card with the correct dimensions. onClick is optional and expects a function, which we execute when a user clicks on the image. We use it later to enlarge the card. The prop showActions is enabled by default and determines if the inference button is visible to the user.
Hooks and variables
At first, we initialize our dispatch function. Then we define three different states for our component. The first one is called resizeSize and contains the size of the displayed image according to the defined height and width of the image card. Images come in many different sizes, and to fit them to the screen, we need to resize them. Therefore, we use an array with two entries, where the first one depicts the width and the second one the height.
The next one (showLabel) is an array that contains strings. We later implement a function where you can switch the visibility of a label in the image on or off. We initialize this state with all annotation ids. If an annotation id is not present in this array, we will not display it on the image later.
The highlightedLabel state is either a string (an id) or null. We use this state to highlight a label when a user hovers with the mouse over an annotation.
Lastly, we need to initialize the Refs that we use in this component. We split the image and the annotations into two HTML elements to avoid massive re-renderings. In this way, we only need to load the image once and can redraw the annotation updates and highlights separately.
// States and refs
const dispatch = useAppDispatch();
const [resizeSize, setResizeSize] = useState([0, 0]);
const [showLabel, setShowLabel] = useState<string[]>(
  annotations?.map((el) => {
    return el.id;
  })
);
const [highlightedLabel, setHighlightedLabel] = useState<string | null>(null);
const annotationCanvasRef = useRef<HTMLCanvasElement>(
  document.createElement("canvas")
);
const imageRef = useRef<HTMLCanvasElement>(document.createElement("canvas"));
Redux update
The state update is straightforward for the <ImageCard> component. We wrap the update into the function updateInferenceResult(), which takes two arguments:

- inferenceResult: an array of annotations, which we add to the image
- annotationIdToRemove: an id of an annotation that we delete from the image
const updateInferenceResult = (
  inferenceResult: anyAnnoationObject[],
  annotationIdToRemove: string | null = null
) => {
  var processedAnnotation = annotations;
  if (annotationIdToRemove) {
    processedAnnotation = annotations.filter(
      (el) => el.id !== annotationIdToRemove
    );
  }
  dispatch(
    updateImage({
      id: id,
      src: src,
      title: title,
      dateCreated: dateCreated,
      highlighted: highlighted,
      annotations: [...processedAnnotation, ...inferenceResult],
    })
  );
  setShowLabel([
    ...showLabel,
    ...inferenceResult.map((el: anyAnnoationObject) => el.id),
  ]);
};
The first part checks if an annotation should be deleted and removes it accordingly by filtering out the annotation by its id. Afterward, we dispatch the updateImage reducer with all the states inside the component. The only difference is that we merge the existing list of annotations with the new list of annotations: [...processedAnnotation, ...inferenceResult]. Finally, we need to update the showLabel state to render new annotations immediately.
The rendering
<Card sx={{ margin: "auto", marginTop: 3 }}>
  <CardHeader
    sx={{ background: theme.palette.secondary.main }}
    action={
      <IconButton
        aria-label="settings"
        onClick={() => {
          console.log(`${id} is deleted`);
          dispatch(removeImage({ id: id }));
        }}
      >
        <DeleteIcon />
      </IconButton>
    }
    title={title}
    subheader={new Date(dateCreated).toDateString()}
  />
  <CardMedia>
    <div
      style={{
        width: resizeSize[0],
        height: resizeSize[1],
        position: "relative",
        marginRight: "auto",
        marginLeft: "auto",
        cursor: "pointer",
      }}
      onClick={onClick}
    >
      <canvas
        ref={imageRef}
        width={resizeSize[0]}
        height={resizeSize[1]}
        style={{ position: "absolute" }}
      />
      <canvas
        ref={annotationCanvasRef}
        style={{ position: "absolute" }}
        width={resizeSize[0]}
        height={resizeSize[1]}
      />
    </div>
  </CardMedia>
  <CardContent
    sx={{
      background: theme.palette.background.paper,
      flexWrap: "wrap",
      overflow: "auto",
      width: resizeSize[0],
    }}
  >
    {renderAnnotationChips(annotations)}
  </CardContent>
  {showActions && (
    <CardActions
      disableSpacing
      sx={{
        position: "relative",
        background: theme.palette.secondary.main,
      }}
    >
      <IconButton
        aria-label="add to favorites"
        sx={{ marginRight: "auto" }}
        onClick={() => {
          dispatch(
            updateImage({
              id: id,
              src: src,
              title: title,
              dateCreated: dateCreated,
              highlighted: !highlighted,
              annotations: annotations,
            })
          );
        }}
      >
        {highlighted ? <StarIcon /> : <StarBorderIcon />}
      </IconButton>
      <InferenceMenu src={src} updateAnnotation={updateInferenceResult} />
    </CardActions>
  )}
</Card>
Our <ImageCard> returns an MUI Card, and most of it follows the tutorial from the official MUI page. The card header displays the name of the image as the title and the dateCreated as a subtitle. Furthermore, the action of the <CardHeader> is used to display a <DeleteIcon>, which dispatches the removeImage action.
An image and its annotations are displayed inside the <CardMedia> component. I mentioned earlier that we want to split the image and label canvas to redraw the annotations without changing the image. Furthermore, we want to resize the image, which we do by drawing the image onto an HTML canvas. Thus, both canvases (image and annotation) should overlay. To achieve this, we use a simple <div> as the parent container, which has the expected width and height of the image. We achieve this by setting width and height to resizeSize[0] and resizeSize[1], respectively. Then we can set the position of both canvases to absolute to orient each canvas at the top of the parent <div>.
Our <CardContent> is used to display the annotations. Furthermore, we want to highlight the bounding box if one hovers over the annotation. We wrap the described logic in a single function:
const renderAnnotationChips = (annotations: anyAnnoationObject[]) => {
  return annotations?.map(
    (
      el:
        | bboxAnnotationObject
        | polygonAnnotationObject
        | classAnnotationObject
    ) => {
      return (
        <Chip
          label={el.className}
          key={el.id}
          icon={
            el.type === "bbox" ? (
              <CheckBoxOutlineBlankIcon />
            ) : el.type === "polygon" ? (
              <HexagonOutlinedIcon />
            ) : (
              <ImageOutlinedIcon />
            )
          }
          sx={{
            marginLeft: "5px",
            marginRight: "5px",
            backgroundColor: showLabel.includes(el.id) ? el.color : null,
          }}
          variant={showLabel.includes(el.id) ? "filled" : "outlined"}
          // Functions
          onClick={() => {
            if (showLabel?.includes(el.id)) {
              setShowLabel(showLabel.filter((id) => el.id !== id));
            } else {
              setShowLabel([...showLabel, el.id]);
            }
          }}
          onDelete={() => updateInferenceResult([], el.id)}
          deleteIcon={<DeleteIcon />}
          onMouseOver={() => {
            setHighlightedLabel(el.id);
          }}
          onMouseOut={() => {
            setHighlightedLabel(null);
          }}
        />
      );
    }
  );
};
The function expects an array of annotations. For each annotation, the function renders a <Chip>. Based on the annotation type, it renders a different icon. The label of the <Chip> is the class name. If one clicks on a <Chip>, the annotation should disappear/reappear on the canvas. Therefore, we remove/add the specific annotation id from/to the showLabel state. To highlight the label on hover, we set the highlightedLabel state to the specific annotation id and remove it when the cursor moves out. With this <Chip> it is also possible to delete an annotation by clicking on the delete icon. Therefore, we set the onDelete and deleteIcon properties of the <Chip>. The delete method is the updateInferenceResult method, called with the specific annotation id, which we already discussed.
We use the <CardActions> element to display a button for add-to-favorites and inference. To align them left and right, we set marginLeft and marginRight to auto, respectively. When clicking on the button that adds an image to your favorites, we invert the highlighted state and dispatch the whole imageCardObject. Regarding the inference button, we will go into implementation details in the next post about inference.
Draw computer vision annotations in React
Until now, we discussed all JSX components and HTML tags used for the <ImageCard> as well as the internal state and variables of the component. The only thing left is our interaction with the canvas. As mentioned before, we have one canvas for the image and a second canvas for the annotations.
Image canvas
Drawing images on a canvas is very basic. However, we want to resize the image to fit the card's dimensions. A canvas expects you to provide a start (in our case [0, 0]) and the maximum x and y values. To keep the number of redraws of the image as low as possible, we implement a useEffect hook, which listens on the height, width, and src props of the component. Thus, the image is only drawn if either the width or height changes, or the image itself. Here is the code:
// Effects
useEffect(() => {
  const img = new Image();
  img.onload = (el: any) => {
    // Not optimal, however the solution does not work: https://www.kindacode.com/article/react-typescript-image-onload-onerror-events/
    let newResizeSize;
    newResizeSize = [
      el?.currentTarget?.width * ((height * 0.6) / el?.currentTarget?.height),
      height * 0.6,
    ];
    if (newResizeSize[0] > width) {
      newResizeSize = [
        width,
        el?.currentTarget?.height * (width / el?.currentTarget?.width),
      ];
    }
    setResizeSize(newResizeSize);
    // Draw on image canvas
    imageRef?.current
      .getContext("2d")
      ?.drawImage(img, 0, 0, newResizeSize[0], newResizeSize[1]);
  };
  img.src = src;
}, [height, width, src]);
Regarding the workflow of the hook, we create a new HTMLImageElement, define an onload function that draws on the canvas, and load the image afterward by setting img.src. The onload function defines a variable called newResizeSize. This contains the maximum x and y values for the resized image. We calculate them by defining a target size (in our case height * 0.6) and calculating the respective rescale factor for the other side (width). We do this twice in the process. First, we scale the image to 60% of the <Card> height. Secondly, we check if the rescaled width is larger than the Card's width, and if so, we resize the image to the width of the Card. After we determine the new image size, we update its state (resizeSize) accordingly and use drawImage() from the canvas context to draw the image. We need this process to determine the maximum size of the drawn image while preserving the image's aspect ratio.
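To make the two-step calculation more concrete, here is a minimal sketch of the same math as a standalone function with made-up card and image sizes (the function name and numbers are my own, not part of the component):

// A minimal sketch of the resize math from the hook above, for illustration only
const computeResizeSize = (
  imgW: number,
  imgH: number,
  cardW: number,
  cardH: number
): number[] => {
  // First pass: scale the image to 60% of the card height
  let newResizeSize = [imgW * ((cardH * 0.6) / imgH), cardH * 0.6];
  // Second pass: if the result is wider than the card, scale to the card width
  if (newResizeSize[0] > cardW) {
    newResizeSize = [cardW, imgH * (cardW / imgW)];
  }
  return newResizeSize;
};

console.log(computeResizeSize(4000, 3000, 800, 600)); // [480, 360]
console.log(computeResizeSize(4000, 1000, 800, 600)); // [800, 200]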
Drawing annotations
We handle every redraw of annotations with a useEffect hook. It watches the annotations prop, the showLabel and highlightedLabel states, as well as the resizeSize state. We need to observe the resizeSize because the dimensions of annotationCanvasRef depend on it. We separated the drawing logic on the canvas into the ./util/drawAnnotations file. Thus, the code in the useEffect function is small. It only checks whether the annotationCanvasRef is not null and if the resizeSize is already calculated and triggers the redraw accordingly:
useEffect(() => {
  if (resizeSize[0] !== 0 && annotationCanvasRef !== null) {
    drawAnnotations(
      annotations,
      annotationCanvasRef,
      resizeSize[0],
      resizeSize[1],
      highlightedLabel,
      showLabel
    );
  }
}, [annotations, resizeSize, showLabel, highlightedLabel]);
So let us have a look at the drawAnnotations function:
import { RefObject } from "react";
import {
  anyAnnoationObject,
  bboxAnnotationObject,
  polygonAnnotationObject,
} from "./types";

export const drawAnnotations = (
  annotations: anyAnnoationObject[],
  canvas: RefObject<HTMLCanvasElement>,
  imgW: number,
  imgH: number,
  highlightAnnotationId: string | null,
  filter: string[]
) => {
  // Get context
  const ctx = canvas.current?.getContext("2d");
  if (ctx) {
    ctx.clearRect(0, 0, imgW, imgH);
    ctx.lineWidth = 2;
    annotations.forEach((el) => {
      if (filter.includes(el.id)) {
        ctx.strokeStyle = el.color;
        if (el.type === "bbox") {
          drawBbox(ctx, el, imgW, imgH, highlightAnnotationId);
        }
        if (el.type === "polygon") {
          drawPolygon(ctx, el, imgW, imgH, highlightAnnotationId);
        }
      }
    });
  }
};
Since we use basic HTML and JS in this function, we only need to import some types to comply with TypeScript. The main function drawAnnotations takes an array of annotations, the canvas Ref, the image dimensions, a highlighted label string, and a filter as input. At the start of the function, we retrieve the canvas's context and clear the canvas. Afterward, we iterate over each annotation and execute the specific draw function for each annotation type (bbox or polygon) if the annotation.id is in the filter array. The draw functions are pretty similar and only differ by the method of the context we use:
const drawBbox = (
  ctx: CanvasRenderingContext2D,
  annotation: bboxAnnotationObject,
  imgW: number,
  imgH: number,
  highlightAnnotationId: string | null
) => {
  ctx.beginPath();
  ctx.rect(
    annotation.box[0] * imgW,
    annotation.box[1] * imgH,
    annotation.box[2] * imgW,
    -annotation.box[3] * imgH
  );
  if (highlightAnnotationId === annotation.id) {
    ctx.fillStyle = annotation.color + "4D";
    ctx.fill();
  }
  ctx.stroke();
};

const drawPolygon = (
  ctx: CanvasRenderingContext2D,
  annotation: polygonAnnotationObject,
  imgW: number,
  imgH: number,
  highlightAnnotationId: string | null
) => {
  ctx.beginPath();
  ctx.moveTo(annotation.polygon[0][0] * imgW, annotation.polygon[0][1] * imgH);
  annotation.polygon.forEach((el: number[], idx: number) => {
    if (idx === 0) {
      ctx.moveTo(el[0] * imgW, (1 - el[1]) * imgH);
    } else {
      ctx.lineTo(el[0] * imgW, (1 - el[1]) * imgH);
    }
  });
  if (highlightAnnotationId === annotation.id) {
    ctx.fillStyle = annotation.color + "4D";
    ctx.fill();
  }
  ctx.closePath();
  ctx.stroke();
};
At first, we execute ctx.beginPath to ensure that we start to draw a new object. Afterward, we draw the shape. For bounding boxes, we can use ctx.rect to draw a rectangle by defining x1, y1, width, and height. The ctx.rect function expects you to provide the top-left point as the starting point and to specify the width and height of the rectangle. If you are familiar with Python frameworks for computer vision, most of them expect you to specify the bottom-left point as the starting point. We stick to the Python convention to store annotations. By adding a minus to the height, JS allows us to reverse the height and draw from bottom to top. Thus, we can specify the bottom-left point as the starting point.
I like to store points in images the way FiftyOne does: as relative values from 0 to 1. Thus, the coordinates are independent of the image size and are not affected by resizing, as long as you keep the aspect ratio.
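As a quick sketch of what this means in practice, converting an absolute pixel box into the relative format could look like the following (the helper name and the pixel-based [x, y, width, height] input are my own assumptions, not part of the app):

// Hypothetical helper: convert an absolute pixel box [x, y, width, height]
// into relative coordinates (0..1), as used by the annotation objects above.
const toRelativeBox = (
  pixelBox: number[],
  imageWidth: number,
  imageHeight: number
): number[] => {
  return [
    pixelBox[0] / imageWidth,
    pixelBox[1] / imageHeight,
    pixelBox[2] / imageWidth,
    pixelBox[3] / imageHeight,
  ];
};

// A 200 x 100 px box at (50, 150) in a 1000 x 500 px image
console.log(toRelativeBox([50, 150, 200, 100], 1000, 500));
// -> [0.05, 0.3, 0.2, 0.2]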
To draw a polygon, we iterate over each (x, y) coordinate and call ctx.lineTo, except for the first point, where we call ctx.moveTo to define a starting point.
After we draw the shape (bbox or polygon), we fill the shape if the annotation.id is equal to the highlightAnnotationId. Therefore, we add "4D" to the hex value of the annotation.color to set an alpha value of 30% for the color. Lastly, we call ctx.fill and ctx.stroke to apply the changes to the canvas.
An image list for our computer vision React app
Our React app for computer vision should display a list of <ImageCard>s on the home screen. At first, I wanted to use <ImageList> from MUI. It is perfect for plain images but does not integrate with custom components like <ImageCard>. Thus, we create a list with good old divs and CSS. The component is called <CustomImageList>, and you can find it under ./components/CustomImageList.tsx:
import Modal from "@mui/material/Modal";
import Container from "@mui/material/Container";
import ImageCard, { imageCardProps } from "./ImageCard";
import { useAppSelector } from "../app/hooks";
import { imageCardObject } from "../util/types";
import { useState } from "react";

export default function CustomImageList() {
  const [modalCardId, setModalCardId] = useState<null | string>(null);
  const images = useAppSelector((state) => state.images.images);

  return (
    <>
      <div
        style={{
          display: "flex",
          flexWrap: "wrap",
          justifyContent: "center",
          justifyItems: "center",
          alignContent: "center",
        }}
      >
        {images.map((image: imageCardObject) => {
          const props: imageCardProps = {
            ...image,
            width: window.innerWidth * 0.49,
            height: window.innerHeight * 0.49,
            onClick: () => {
              setModalCardId(image.id);
            },
          };
          return (
            <ImageCard {...props} key={`${props.id}-image-card-in-list`} />
          );
        })}
      </div>
      <Modal open={modalCardId !== null} onClose={() => setModalCardId(null)}>
        <Container>
          {modalCardId !== null &&
            images
              .filter((el: imageCardObject) => el.id === modalCardId)
              .map((el: imageCardObject) => {
                return (
                  <ImageCard
                    {...{
                      ...el,
                      width: window.innerWidth * 0.8,
                      height: window.innerHeight * 0.8,
                    }}
                    key={`${el.id}-modal-card`}
                  />
                );
              })}
        </Container>
      </Modal>
    </>
  );
}
The component has two functions. The first one is to display all the <ImageCard>s, and the second one is to enlarge a specific <ImageCard> if someone clicks on the image. We use a <div> and set its display to flex and flexWrap to wrap to create a list. Then we center all items, and our <ImageCard>s appear in a list. We fetch the redux state for the images and iterate over each one to create an <ImageCard> component.
To enlarge the image, we use a <Modal> component from MUI. To trigger the modal, we use the modalCardId state, which contains either null (modal not open) or a string, which should be the id of an imageCardObject. If it is set to the id of a card, the modal opens and shows the enlarged card. The card is wrapped inside an MUI <Container> to center it.
Create an entry point for images
Until now, we assumed that we already have images in our React app for computer vision. However, a user needs some functionality to insert images. Luckily, we can use our <ImageCard> component to visualize an image. Then, we add a save and a discard button at the bottom, and we have a nice interface to insert images. We achieve this with this component under src/components/AddContentFab.tsx:
// MUI
import Fab from "@mui/material/Fab";
import CameraIcon from "@mui/icons-material/Camera";
import Modal from "@mui/material/Modal";
import Button from "@mui/material/Button";
import Grid from "@mui/material/Grid";
// Other
import { useRef, useState } from "react";
import { v4 as uuidv4 } from "uuid";
import ImageCard from "./ImageCard";
import { useAppDispatch } from "../app/hooks";
import { addImage } from "../app/imageState";
import { imageCardObject } from "../util/types";

export default function AddContentFab() {
  // States, Refs & vars
  const [takenImage, setTakenImage] = useState<null | string>(null);
  const [imageName, setImageName] = useState<null | string>(null);
  const [dateCreated, setDateCreated] = useState<null | string>(null);
  const inputFileRef = useRef<null | HTMLInputElement>(null);
  const dispatch = useAppDispatch();

  // Click functions
  const handleModalClose = () => {
    setImageName(null);
    setTakenImage(null);
  };

  const resizeImageToMaximum2Mb = (img: string) => {
    if (img.length > 1_000_000) {
      const rescaleRatio = 1_000_000 / img.length;
      console.log(`Rescale ratio: ${rescaleRatio}`);
      const tmp_img = new Image();
      tmp_img.onload = () => {
        const rescaledSize = [
          Math.floor(tmp_img.width * rescaleRatio),
          Math.floor(tmp_img.height * rescaleRatio),
        ];
        const canvas = document.createElement("canvas");
        canvas.width = rescaledSize[0];
        canvas.height = rescaledSize[1];
        const ctx = canvas.getContext("2d");
        ctx?.drawImage(tmp_img, 0, 0, rescaledSize[0], rescaledSize[1]);
        setTakenImage(canvas.toDataURL("image/jpeg", 1));
      };
      tmp_img.src = img;
    } else {
      setTakenImage(img);
    }
  };

  const saveImage = () => {
    const payload: imageCardObject = {
      id: uuidv4(),
      src: String(takenImage),
      title: String(imageName),
      dateCreated: String(dateCreated),
      highlighted: false,
      annotations: [],
    };
    dispatch(addImage(payload));
    handleModalClose();
  };

  return (
    <>
      <Fab
        sx={{ position: "fixed", bottom: 16, right: 16 }}
        onClick={() => {
          if (inputFileRef.current !== null) {
            inputFileRef.current.click();
          }
        }}
      >
        <CameraIcon />
      </Fab>
      <Modal open={takenImage !== null} onClose={handleModalClose}>
        <Grid container spacing={2}>
          <Grid item xs={12}>
            <ImageCard
              id={uuidv4()}
              src={takenImage !== null ? takenImage : ""}
              annotations={[]}
              width={window.innerWidth * 0.8}
              height={window.innerHeight * 0.8}
              title={imageName ? imageName : "Your new image"}
              dateCreated={String(dateCreated)}
              highlighted={false}
              showActions={false}
            />
          </Grid>
          <Grid
            item
            xs={6}
            sx={{
              display: "flex",
              justifyContent: "center",
              alignItems: "center",
            }}
          >
            <Button variant="contained" size="large" onClick={handleModalClose}>
              Discard
            </Button>
          </Grid>
          <Grid
            item
            xs={6}
            sx={{
              display: "flex",
              justifyContent: "center",
              alignItems: "center",
            }}
          >
            <Button variant="contained" size="large" onClick={saveImage}>
              Save
            </Button>
          </Grid>
        </Grid>
      </Modal>
      <input
        ref={inputFileRef}
        type="file"
        style={{ display: "none" }}
        accept="image/*"
        onChange={(inp) => {
          if (inp.currentTarget.files !== null) {
            const reader = new FileReader();
            reader.onload = (e) => {
              resizeImageToMaximum2Mb(String(e.target?.result));
            };
            setImageName(inp?.currentTarget.files[0].name);
            setDateCreated(String(new Date()));
            reader.readAsDataURL(inp.currentTarget.files[0]);
          }
        }}
      />
    </>
  );
}
As imports, we need some MUI components, our redux state, and our <ImageCard> component. Our state and variables describe, for the most part, the new image, like dateCreated, imageName, and the image itself: takenImage. These are all set when a user selects an image. Then, we have a Ref that holds an HTMLInputElement. We need this later to open the browser input menu for images when the user clicks on the floating action button.
Before we get to the return statement, we define three functions. The first one, handleModalClose, resets the state of the image name and the image itself. We wrap this because we need to call it from several points in the component. We use the second function (saveImage) to save the image. It constructs an imageCardObject and dispatches it to the redux state. We need the last function to resize images if they are too large. Remember that we save all images to the local storage. However, some browsers have a maximum quota for the local storage, mostly ranging between 2 and 10 MB. Since the local storage stores only UTF-16 encoded strings, each character needs two bytes. Thus, we set the maximum number of characters to one million, which equals 2 MB. The function resizeImageToMaximum2Mb checks if the image source is larger than one million characters and, if so, calculates the ratio to resize the string to one million characters by dividing one million by the current length of the image source. With this rescaleRatio, the image is loaded and resized by drawing it on a canvas, which is not rendered in the DOM. Afterward, the canvas is converted to a data URL and saved to the internal takenImage state. This function allows users to take photos with their smartphone (usually more than 8 MB) and still use them inside the app.
Now let’s look at the rendered elements of the component. At first, we define our floating action button. We want to stick the action button to the bottom right side of the application. Therefore, we set the position to fixed and add some margin to the bottom and right. As an icon, we use the official CameraIcon from MUI. As the onClick method, we trigger the click method of our inputFileRef mentioned earlier. The corresponding HTMLInputElement is defined at the end of the component, and its display CSS argument is set to none to make it invisible. The type is set to file and accept to image/*. With this setting, the user can only select one image from their device or take a picture if they use a smartphone. Then, we define an onChange method for the input element that creates a new FileReader, which executes the resizeImageToMaximum2Mb function after loading the corresponding data URL of the image (base64 encoded). Afterward, we try to set the imageName and the dateCreated states.
If the state takenImage is not null, the <Modal> opens, and the image is displayed. The <Modal> incorporates an <ImageCard> and our save and discard buttons. To arrange these components, we use MUI's <Grid> component. First, we create a grid with the container prop, and afterward, we create an item <Grid> to incorporate the <ImageCard> with the xs prop set to 12 to stretch it over the whole width. The buttons are both encapsulated in a <Grid> (with the item prop set) with an xs value of 6 to place them in one row.
Outlook on our computer vision app in React
This concludes the first of two posts, in which we build an app that executes computer vision models entirely on the client side. Again, you can find all the code here and visit the app under this link. I will add an inference option with ONNXruntime for JS in the next post. Stay tuned.
If you are interested in ONNXruntime in Python, check out my other post about object detection in AWS Lambda.