Questions tagged [machine-learning]

How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?

From The Discipline of Machine Learning by Tom Mitchell:

The field of Machine Learning seeks to answer the question "How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?" This question covers a broad range of learning tasks, such as how to design autonomous mobile robots that learn to navigate from their own experience, how to data mine historical medical records to learn which future patients will respond best to which treatments, and how to build search engines that automatically customize to their user's interests. To be more precise, we say that a machine learns with respect to a particular task T, performance metric P, and type of experience E, if the system reliably improves its performance P at task T, following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, programming by example, etc.

3322 questions

2 answers

Perceptron find weight exercise

I have some difficulties with the following exercise. There are three different diagrams. If possible, find the perceptron weights $w_0$, $w_1$, and $w_2$ for each of them (the decision surface is clearly divided into two regions, one "positive" the…

machine-learning

asked Feb 02 '14 at 15:42

user16168

2 answers

Compressing the Mandelbrot set

This question may not have a definitive answer. However, if someone is able to illuminate the topic for me, I would be very grateful. The Mandelbrot set is the set obtained from the quadratic recurrence…

machine-learning

asked May 30 '11 at 16:14

Ncarlson

2 answers

GAN Nash equilibrium

I'm reading Ian Goodfellow’s article about Generative Adversarial Networks (https://arxiv.org/pdf/1701.00160.pdf) and, on page 22, I found a sentence that I don’t understand. It’s about the GAN convergence evaluated with the Nash game…

machine-learning

asked Jun 29 '19 at 14:36

Fabrizio R.

1 answer

Cross-Entropy loss in Reinforcement Learning

In the context of supervised learning for classification using neural networks, when evaluating the performance of an algorithm we can use the cross-entropy loss, given by: $$ L = -\sum_{i=1}^n \log(\pi (f(x_i))_{y_i}) $$ where $x_i$ is a vector…

machine-learning

asked Apr 10 '18 at 12:46

Michael Murray
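
As a rough illustration of the loss in the excerpt above, the sketch below evaluates $L$ for a small batch. It assumes that $\pi$ is a softmax over the network outputs $f(x_i)$ and that the labels $y_i$ are 0-based class indices; neither is confirmed by the truncated question, and the names `softmax` and `cross_entropy` are made up for this example.

```python
import math

def softmax(scores):
    """Map raw scores to probabilities (assumed meaning of pi in the excerpt)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(score_rows, labels):
    """L = -sum_i log(softmax(f(x_i))[y_i]), with labels as 0-based indices."""
    return -sum(math.log(softmax(row)[y]) for row, y in zip(score_rows, labels))

# Two classes with equal scores: each class gets probability 1/2.
loss = cross_entropy([[0.0, 0.0]], [0])
```

With equal scores for two classes the predicted probability is 1/2, so the loss for that single example is $\log 2$.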

2 answers

For a PAC learnable hypothesis class, show that its sample complexity $m_{\mathcal{H}}$ is monotonically non-increasing in each of its parameters

Not sure if this is the right place to post this; if it isn't, I'll be grateful if someone directs me to a better place to post it. I'm independently taking the course Introduction to Machine language (as in, doing it by myself) using the book:…

machine-learning

asked Apr 15 '15 at 10:34

user475680

2 answers

VC dimension of perpendicular lines classifier

I was learning about VC dimension, and I saw an example in "Introduction to Machine Learning" showing that the VC dimension of a rectangle is 4. I'm just curious about the VC dimension of two perpendicular lines. I tried to shatter some points but I'm not…

machine-learning

asked Oct 25 '22 at 22:37

zahra Hosseini

1 answer

Equation (3.89) seems wrong in Bishop's Pattern Recognition and Machine Learning book

In Bishop's Pattern Recognition and Machine Learning book, I seem to have found a serious mistake in a math equation; serious because all subsequent arguments rely on it. It is eq. (3.89) on page 168: $$ 0 = \frac{M}{2\alpha}…

machine-learning

asked Jul 31 '20 at 06:47

Royalblue

1 answer

Normalized distance from origin to discriminant function for linear classifiers

I'm currently studying machine learning with the book Pattern Recognition and Machine Learning (Bishop, 2006) and had a question regarding finding the distance between the origin and a linear discriminant function. For anyone curious, this is from…

machine-learning

asked Jan 13 '20 at 00:16

Sean

0 answers

Cross entropy for binary or multiclass classification

I'm building a NN classifier to predict whether a sample is of class 1 or 0. I'm trying three different network configurations: one unit in the output layer with a sigmoid activation function; two units in the output layer with a sigmoid activation function; two…

machine-learning

asked Oct 26 '19 at 13:02

cylon86

1 answer

How does one code the generative adversarial network loss function?

I was reading Ian Goodfellow's paper on GANs and read that the loss function for GANs is: $J^{(G)} = -J^{(D)} = \frac{1}{2} \mathbb{E}_{x \sim p_{\rm data}}\Big[ \log D(x)\Big] + \frac{1}{2} \mathbb{E}_{z} \Big[\log (1-D(G(z)))\Big]$ I saw a few…

machine-learning

asked Jul 15 '18 at 12:48

Hoda Fakharzadeh
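
One way to read the expression quoted above is as a Monte Carlo estimate over a batch of samples. The sketch below assumes the discriminator outputs probabilities strictly between 0 and 1 and takes them as plain Python lists; `minimax_value`, `d_real`, and `d_fake` are hypothetical names, and no particular deep learning library's API is implied.

```python
import math

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def minimax_value(d_real, d_fake):
    """Estimate 1/2 E[log D(x)] + 1/2 E[log(1 - D(G(z)))] from samples.

    d_real: discriminator outputs D(x) on real data samples
    d_fake: discriminator outputs D(G(z)) on generated samples
    """
    real_term = mean([math.log(d) for d in d_real])
    fake_term = mean([math.log(1.0 - d) for d in d_fake])
    return 0.5 * real_term + 0.5 * fake_term

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
value = minimax_value([0.5, 0.5], [0.5, 0.5])
```

In the zero-sum form shown in the excerpt, the generator's cost is this value and the other player's cost is its negative. For a discriminator that outputs 0.5 on every sample, the estimate is $-\log 2$.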

1 answer

Notation in the derivative of the hinge loss function

The hinge loss function (summed over $m$ examples): $$ l(w)= \sum_{i=1}^{m} \max\{0 ,1-y_i(w^{\top} \cdot x_i)\} $$ My calculation of the subgradient for a single component and example is: $$ l(z) = \max\{0, 1 - yz\} $$ $$ l^{\prime}(z) = \max\{0, -…

machine-learning

asked Jan 18 '17 at 00:29

jds
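
For reference, the summed hinge loss from the excerpt above and one valid subgradient can be sketched as follows. The helper names are invented for this example, and picking the zero subgradient at the kink (where $1 - y_i w^{\top} x_i = 0$) is a convention chosen here, not something the question states.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def hinge_loss(w, xs, ys):
    """l(w) = sum_i max(0, 1 - y_i * (w . x_i)), with labels y_i in {-1, +1}."""
    return sum(max(0.0, 1.0 - y * dot(w, x)) for x, y in zip(xs, ys))

def hinge_subgradient(w, xs, ys):
    """One subgradient of l at w: sum of -y_i * x_i over examples with positive loss."""
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        if 1.0 - y * dot(w, x) > 0.0:  # example violates the margin
            for j, xj in enumerate(x):
                g[j] -= y * xj
        # otherwise the example contributes 0 (convention used at the kink too)
    return g

xs = [[1.0, 2.0], [3.0, -1.0]]
ys = [1, -1]
loss_at_zero = hinge_loss([0.0, 0.0], xs, ys)
grad_at_zero = hinge_subgradient([0.0, 0.0], xs, ys)
```

At $w = 0$ every example has loss 1, so each one contributes $-y_i x_i$ to the subgradient.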

0 answers

Michael Nielsen's book “Neural Networks and Deep Learning” Cauchy-Schwarz Inequality Proof

In the free online book the following is stated: if $C$ is a cost function which depends on $v_1, v_2, \ldots, v_n$, he states that we make a move in the $\Delta v$ direction to decrease $C$ as much as possible, and that's equivalent to minimizing $\Delta C \approx \nabla C \cdot \Delta v$. So if…

machine-learning

asked Aug 28 '16 at 18:15

par

4 answers

VC-Dimension of Real Linear Classifier Proof

Does anyone know of, or have a link to, a proof of why the VC-dimension of linear classifiers in $\mathbb{R}^n$ is $n+1$? That is, the set of $h_a : \mathbb{R}^n \rightarrow \{-1,1\}, h_a(b) = \operatorname{sgn}(a \cdot b + k)$ where $a,b \in \mathbb{R}^n, k \in…

machine-learning

asked Oct 31 '15 at 23:56

Math is Hard

3 answers

How does the kernel method work

I have been learning about kernel methods for a long time, but I am still not very sure how they work. In my opinion, it works as follows: say $f(x) = \sum_i\alpha_ik(x_i, x)$. First we need to decide which kernel we should use. The common one is the…

machine-learning

asked Jul 13 '15 at 20:47

tqjustc
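
A minimal sketch of evaluating the expansion $f(x) = \sum_i \alpha_i k(x_i, x)$ from the excerpt above. The excerpt is cut off before it names a kernel, so the Gaussian (RBF) kernel used here is only an illustrative choice, and `rbf_kernel`, `kernel_expansion`, and `gamma` are hypothetical names. The coefficients are taken as given; how they are fitted depends on the method and is not shown.

```python
import math

def rbf_kernel(u, v, gamma=1.0):
    """Gaussian kernel k(u, v) = exp(-gamma * ||u - v||^2); an assumed choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def kernel_expansion(x, points, alphas, kernel=rbf_kernel):
    """f(x) = sum_i alpha_i * k(x_i, x) over the stored points x_i."""
    return sum(a * kernel(xi, x) for a, xi in zip(alphas, points))

points = [[0.0], [1.0]]   # the x_i
alphas = [2.0, 0.0]       # coefficients, assumed already learned
f_at_origin = kernel_expansion([0.0], points, alphas)
```

Because $k(x, x) = 1$ for this kernel and the second coefficient is zero, the value at the origin equals the first coefficient.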

1 answer

Gaussian Process Regression

Observations: $$ X= \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0.5 & 2 \\ \end{pmatrix} $$ $$ y= \begin{pmatrix} y_1 \\ y_2 \\ …

machine-learning

asked Dec 15 '14 at 16:36

Xxx
